Learn about the state-of-the-art at the interface between information theory and data science with this first unified treatment of the subject. Written by leading experts in a clear, tutorial style, and using consistent notation and definitions throughout, it shows how information-theoretic methods are being used in data acquisition, data representation, data analysis, and statistics and machine learning. Coverage is broad, with chapters on signal acquisition, data compression, compressive sensing, data communication, representation learning, emerging topics in statistics, and much more. Each chapter includes a topic overview, definition of the key problems, emerging and open problems, and an extensive reference list, allowing readers to develop in-depth knowledge and understanding. Providing a thorough survey of the current research area and cutting-edge trends, this is essential reading for graduate students and researchers working in information theory, signal processing, machine learning, and statistics.
Mixing disciplines frequently produces something profound and far-reaching; cybernetics is an often-quoted example. The mix of information theory, statistics, and computing technology has likewise proved very useful, leading to the recent development of information-theory-based methods for estimating complicated probability distributions. Estimating the probability distribution of a random variable is a fundamental task in many fields besides statistics, such as reliability, probabilistic risk analysis (PRA), machine learning, pattern recognition, image processing, neural networks, and quality control. Simple distribution forms such as the Gaussian, exponential, or Weibull distributions are often employed to represent the distributions of the random variables under consideration, as we are taught in universities. In engineering, physical-science, and social-science applications, however, the distributions of many random variables or random vectors are so complicated that they do not fit these simple forms at all. Accurate estimation of the probability distribution of a random variable is therefore very important. Take stock market prediction as an example: the Gaussian distribution is often used to model the fluctuations of stock prices, but if those fluctuations are not in fact normally distributed, how can we expect predictions built on the normal distribution to be correct? Another case illustrating the need for accurate estimation of probability distributions is reliability engineering, where failure to estimate the relevant distributions accurately may lead to disastrous designs. There have been constant efforts to find appropriate methods for determining complicated distributions from random samples, but the topic has never been systematically discussed in detail in a book or monograph. The present book is intended to fill that gap and documents the latest research on this subject.
Determining a complicated distribution is not simply a multiple of the effort needed to determine a simple one; it turns out to be a much harder task. Two important mathematical tools beyond traditional mathematical statistics, function approximation and information theory, are often used. Several distribution-estimation methods built on these two tools are detailed in this book. The author has applied them to many cases over several years. They are superior in the following senses: (1) no prior information about the distribution form is necessary, as the form can be determined automatically from the sample; (2) the sample size may be large or small; (3) they are particularly well suited to computers. It is the rapid development of computing technology that makes fast estimation of complicated distributions possible. The methods provided herein demonstrate the significant cross-influence between information theory and statistics, and highlight shortcomings of traditional statistics that can be overcome by information theory. Key features: density functions determined automatically from samples; no assumed density forms; computationally efficient methods suitable for PCs.
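The blurb does not reproduce the book's own algorithms, but the idea of estimating a density "automatically from the sample, free of assumed forms" can be illustrated with a minimal nonparametric sketch. The kernel-density estimator below (Gaussian kernel, Silverman's rule-of-thumb bandwidth) is a standard stand-in, not the book's method:

```python
import math
import random

def gaussian_kde(sample, bandwidth=None):
    """Estimate a density directly from a sample, with no assumed
    parametric form (Gaussian kernel, Silverman bandwidth)."""
    n = len(sample)
    if bandwidth is None:
        mean = sum(sample) / n
        std = math.sqrt(sum((x - mean) ** 2 for x in sample) / (n - 1))
        bandwidth = 1.06 * std * n ** (-1 / 5)  # Silverman's rule of thumb
    def density(x):
        return sum(
            math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in sample
        ) / (n * bandwidth * math.sqrt(2 * math.pi))
    return density

# A bimodal sample that no single Gaussian, exponential, or Weibull fits.
random.seed(0)
sample = ([random.gauss(-2, 0.5) for _ in range(500)]
          + [random.gauss(3, 1.0) for _ in range(500)])
f = gaussian_kde(sample)
# The estimate recovers both modes without any assumed form:
print(f(-2) > f(0.5), f(3) > f(0.5))
```

A parametric fit of a single Gaussian to this sample would place its peak near 0.5, between the two real modes, which is exactly the failure mode the blurb warns about.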
This book constitutes the refereed proceedings of the International Conference on Soft Computing in Data Science, SCDS 2016, held in Putrajaya, Malaysia, in September 2016. The 27 revised full papers presented were carefully reviewed and selected from 66 submissions. The papers are organized in topical sections on artificial neural networks; classification, clustering, visualization; fuzzy logic; information and sentiment analytics.
This book constitutes the post-conference proceedings of the 5th International Conference on Machine Learning, Optimization, and Data Science, LOD 2019, held in Siena, Italy, in September 2019. The 54 full papers presented were carefully reviewed and selected from 158 submissions. The papers cover topics in the field of machine learning, artificial intelligence, reinforcement learning, computational optimization and data science presenting a substantial array of ideas, technologies, algorithms, methods and applications.
Issues in Networks Research and Application: 2011 Edition is a ScholarlyEditions™ eBook that delivers timely, authoritative, and comprehensive information about Networks Research and Application. The editors have built Issues in Networks Research and Application: 2011 Edition on the vast information databases of ScholarlyNews.™ You can expect the information about Networks Research and Application in this eBook to be deeper than what you can access anywhere else, as well as consistently reliable, authoritative, informed, and relevant. The content of Issues in Networks Research and Application: 2011 Edition has been produced by the world’s leading scientists, engineers, analysts, research institutions, and companies. All of the content is from peer-reviewed sources, and all of it is written, assembled, and edited by the editors at ScholarlyEditions™ and available exclusively from us. You now have a source you can cite with authority, confidence, and credibility. More information is available at http://www.ScholarlyEditions.com/.
The volume presents innovations in data analysis and classification and gives an overview of the state of the art in these scientific fields and applications. Areas that receive considerable attention in the book are discrimination and clustering, data analysis and statistics, as well as applications in marketing, finance, and medicine. The reader will find material on recent technical and methodological developments and a large number of applications demonstrating the usefulness of the newly developed techniques.
This interdisciplinary text offers theoretical and practical results of information theoretic methods used in statistical learning. It presents a comprehensive overview of the many different methods that have been developed in numerous contexts.
A unique and comprehensive text on the philosophy of model-based data analysis and strategy for the analysis of empirical data. The book introduces information theoretic approaches and focuses critical attention on a priori modeling and the selection of a good approximating model that best represents the inference supported by the data. It contains several new approaches to estimating model selection uncertainty and incorporating selection uncertainty into estimates of precision. An array of examples is given to illustrate various technical issues. The text has been written for biologists and statisticians using models for making inferences from empirical data.
This book is the first cohesive treatment of ITL algorithms to adapt linear or nonlinear learning machines in both supervised and unsupervised paradigms. It compares the performance of ITL algorithms with that of their second-order counterparts in many applications.
This textbook introduces an "information-theoretic" philosophy of science based on Kullback-Leibler information. It focuses on a science philosophy of "multiple working hypotheses" and on statistical models to represent them. The text is written for people new to information-theoretic approaches to statistical inference, whether graduate students, post-docs, or professionals. Readers are, however, expected to have a background in general statistical principles, regression analysis, and some exposure to likelihood methods. This is not an elementary text, as it assumes reasonable competence in modeling and parameter estimation.
The two-volume set LNCS 3686 and LNCS 3687 constitutes the refereed proceedings of the Third International Conference on Advances in Pattern Recognition, ICAPR 2005, held in Bath, UK in August 2005. The papers submitted to ICAPR 2005 were thoroughly reviewed by up to three referees per paper, and fewer than 40% of the submitted papers were accepted. The first volume includes 73 contributions related to Pattern Recognition and Data Mining (which included papers from the tracks of pattern recognition methods, knowledge and learning, and data mining); topics addressed are pattern recognition, data mining, signal processing and OCR/document analysis. The second volume contains 87 contributions related to Pattern Recognition and Image Analysis (which included papers from the applications track) and deals with security and surveillance, biometrics, image processing and medical imaging. It also contains papers from the Workshop on Pattern Recognition for Crime Prevention.
Medical Informatics (MI) is an emerging interdisciplinary science. This book deals with the application of computational intelligence in MI; its novelty lies in addressing the various issues of medical informatics with different computational intelligence approaches. The volume comprises 15 chapters selected on the basis of their fundamental ideas and concepts, including an introductory chapter giving the fundamental definitions and some important research challenges.
Qualitative Comparative Analysis (QCA) and other set-theoretic methods distinguish themselves from other approaches to the study of social phenomena by using sets and the search for set relations. In virtually all social science fields, statements about social phenomena can be framed in terms of set relations, and using set-theoretic methods to investigate these statements is therefore highly valuable. This book guides readers through the basic principles of set theory and then on to the applied practices of QCA. It provides a thorough understanding of basic and advanced issues in set-theoretic methods, together with tricks of the trade, software handling and exercises. Most arguments are introduced using examples from existing research. The use of QCA is increasing rapidly, and the application of set theory is both fruitful and still widely misunderstood in current empirical comparative social research. This book provides a comprehensive guide to these methods for researchers across the social sciences.
The development of effective methods for the prediction of ontological annotations is an important goal in computational biology, yet evaluating their performance is difficult due to problems caused by the structure of biomedical ontologies and incomplete annotations of genes. This work proposes an information-theoretic framework to evaluate the performance of computational protein function prediction. A Bayesian network, structured according to the underlying ontology, is used to model the prior probability of a protein's function. The concepts of misinformation and remaining uncertainty are then defined, which can be seen as analogs of precision and recall. Finally, semantic distance is proposed as a single statistic for ranking classification models. The approach is evaluated by analyzing three protein function predictors of Gene Ontology terms. The work addresses several weaknesses of current metrics, and provides valuable insights into the performance of protein function prediction tools.
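The remaining-uncertainty/misinformation idea can be sketched concretely. Assuming the information content of an ontology term is defined as i(v) = -log2 P(v | parents(v)), remaining uncertainty sums the information of true-but-unpredicted terms and misinformation sums the information of predicted-but-untrue terms; the toy terms and probabilities below are invented for illustration:

```python
import math

# Toy ontology as a DAG: term -> set of parents (illustrative names only).
parents = {
    "root": set(),
    "binding": {"root"},
    "catalysis": {"root"},
    "ion_binding": {"binding"},
}
# Assumed conditional probabilities P(term | its parents apply).
p_given_parents = {"root": 1.0, "binding": 0.5,
                   "catalysis": 0.4, "ion_binding": 0.2}

def ic(term):
    """Information content i(v) = -log2 P(v | parents(v))."""
    return -math.log2(p_given_parents[term])

def propagate(terms):
    """Close a term set under the ancestor relation, so annotations
    stay consistent with the ontology structure."""
    closed, frontier = set(terms), set(terms)
    while frontier:
        frontier = {p for t in frontier for p in parents[t]} - closed
        closed |= frontier
    return closed

def ru_mi(true_terms, pred_terms):
    """Remaining uncertainty (true but unpredicted information) and
    misinformation (predicted but untrue information)."""
    T, P = propagate(true_terms), propagate(pred_terms)
    return sum(ic(t) for t in T - P), sum(ic(t) for t in P - T)

def semantic_distance(true_terms, pred_terms, k=2):
    """Single ranking statistic combining the two error components."""
    ru, mi = ru_mi(true_terms, pred_terms)
    return (ru ** k + mi ** k) ** (1 / k)

ru, mi = ru_mi({"ion_binding"}, {"binding", "catalysis"})
print(round(ru, 3), round(mi, 3))  # missed ion_binding; wrongly added catalysis
```

Here the predictor gets "binding" right (via propagation), misses the more specific "ion_binding" (contributing its information to remaining uncertainty), and wrongly adds "catalysis" (contributing to misinformation), mirroring how precision/recall analogs arise from the ontology structure.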
This Springer Brief presents a comprehensive review of information-theoretic methods for robust recognition. A variety of such methods have been proposed over the past decade across a wide range of computer vision applications; this work brings them together and explains the theory, optimization, and usage of information entropy. The authors adopt a recent information-theoretic concept, correntropy, as a robust measure and apply it to robust face recognition and object recognition problems. For computational efficiency, the brief introduces the additive and multiplicative forms of half-quadratic optimization to efficiently minimize entropy problems, and a two-stage sparse representation framework for large-scale recognition problems. It also describes the strengths and deficiencies of different robust measures in solving robust recognition problems.
All papers were peer-reviewed. For over 25 years the MaxEnt workshops have explored Bayesian and Maximum Entropy methods in scientific, engineering, and signal processing applications. This proceedings volume covers all aspects of probabilistic inference such as techniques, applications, and foundations. Applications include physics, space science, earth science, biology, imaging, graphical models and source separation.
Neural networks provide a powerful new technology to model and control nonlinear and complex systems. In this book, the authors present a detailed formulation of neural networks from the information-theoretic viewpoint. They show how this perspective provides new insights into the design theory of neural networks. In particular they show how these methods may be applied to the topics of supervised and unsupervised learning including feature extraction, linear and non-linear independent component analysis, and Boltzmann machines. Readers are assumed to have a basic understanding of neural networks, but all the relevant concepts from information theory are carefully introduced and explained. Consequently, readers from several different scientific disciplines, notably cognitive scientists, engineers, physicists, statisticians, and computer scientists, will find this to be a very valuable introduction to this topic.
System-Theoretic Methods in Economic Modelling II complements the editor's earlier volume, bringing together current research efforts integrating system-theoretic concepts with economic modelling processes. The range of papers presented here goes beyond the long-accepted control-theoretic contributions in dynamic optimization and focuses on system-theoretic methods in the construction as well as the application stages of economic modelling. This volume initiates new debate, and intensifies existing debate, between researchers and practitioners within and across the disciplines involved, with the objective of encouraging interdisciplinary research. The papers are split into four sections: estimation, filtering and smoothing problems in the context of state space modelling; applying the state space concept to financial modelling; modelling rational expectations; and a miscellaneous section including a follow-up case study by Tse and Khilnani on their integrated system model for a fishery management process, which featured in the first volume.