# Optimization Approaches for Solving String Selection Problems (SpringerBriefs in Optimization)

## Elisa Pappalardo, Giovanni Stracquadanio

Language: English

Pages: 49

ISBN: 1461490529

Format: PDF / Kindle (mobi) / ePub

Optimization Approaches for Solving String Selection Problems provides an overview of optimization methods for a wide class of genomics-related problems in relation to the string selection problems. This class of problems addresses the recognition of similar characteristics or differences within biological sequences. Specifically, this book considers a large class of problems, ranging from the closest string and substring problems, to the farthest string and substring problems, to the far from most string problem. Each problem includes a detailed description, highlighting both biological and mathematical features and presents state-of-the-art approaches. This Brief provides a quick introduction of optimization methods for string selection problems for young scientists and a detailed description of the mathematical and computational methods developed for experts in the field of optimization who want to deepen their understanding of the string selection problems. Researchers, practitioners and graduate students in the field of Computer Science, Operation Research, Mathematics, Computational Biology and Biomedicine will find this book useful.

programming problems can be solved efficiently in the worst case, integer programming problems are NP-hard. This indicates that the worst-case time of any algorithm increases exponentially with the size of the problem, unless P = NP. A special case of IP is represented by 0-1 integer linear programming, known also as binary programming (BP), where variables are required to be 0 or 1. When only some of the unknown variables are required to be integer, the problems is called mixed integer

Hamming distance value have higher probability of being accepted for the next iteration of the algorithm. Another metaheuristic approach applied to solve the CSP consists of the antcolony optimization algorithm (ACO) [20,21] in [23]. ACO metaheuristic represents a transposition of real ants behavior to artificial intelligence and has been inspired by the observation of real ant colonies, where the behavior of each single ant is directed to the survival of the whole colony. In particular, in his

no comparison can be made with other approaches, since there is no literature on the experimental results for the CMSP. A slightly different problem considers the positions in a string as outliers, rather than the strings themselves, and it is known as the Most Strings with Few Bad Columns Problem [9]. Informally, the objective here is to maximize the number of strings with at most k columns having entries not-all-equal. The problem is proved to be NP-hard and APX-hard,3 which means that if P ¤

algorithm for P [3]. 1, there 4.6 The Far From Most String Problem 43 Definition 15 (Far From Most String Problem). Given a threshold t and the input set S , a string s must be found maximizing the variable x such that dffms .x; s i / t, for s i 2 P Â S , and jP j D x. Despite the similarity with the FSP, the FFMSP is much harder to approximate [25, 67]. Due to its hardness, polynomial-time algorithms do not provide any constant guarantee of approximation; for such reason, several heuristic

sequencing, to set the fundamentals to understand the importance of string selection algorithms. 1.1 Introduction The first draft of the euchromatic sequence of the human genome in 2001 by the International Human Genome Sequencing Consortium [4] and Celera Genomics [10] revolutionized the field of biology, providing an extraordinary source of information on the structure and function of a human cell. Determining the sequence of an organism is one of the most challenging interdisciplinary