Basic knowledge of data structures and algorithms

Preface

Data structures and algorithms are one of the key measures of a programmer's fundamental skills, and data structures show up in every area of software. The industry has a saying: program = data structure + algorithm. Middleware developers and architects work hard to optimize middleware, project structure, and algorithms to improve running efficiency and reduce memory usage, and data structures play a very important role in that work. Data structures also embody some object-oriented ideas, so learning and mastering them greatly improves your ability to think abstractly and logically.

Why learn data structures and algorithms? If you are still a student, this course is compulsory, and it is essentially a required subject in the postgraduate entrance examination. If you are looking for a job, data structures and algorithms are a major focus of interviews and written tests at large companies, where competition is fierce. If you are already working, studying them is an important way to strengthen your fundamentals. For programmers who want good results, data structures and algorithms are essential skills!

data structure

concept

A data structure is the way a computer stores and organizes data. It refers to a collection of data elements that have one or more specific relationships with each other. In most cases, a carefully chosen data structure brings higher running or storage efficiency.

In short, a data structure is an efficient way of storing data, formed by organizing a set of storage units according to certain rules and algorithms. Well-known relational databases, non-relational databases, search engine storage, and message queues all make good use of large-scale data structures. Of course, such middleware has to consider more than pure structure; other factors such as the operating system and the network matter too.

This series on data structures and algorithms, however, starts with what we programmers need to master first: abstract, in-memory data structures. These are relatively simple, single-purpose structures such as linear structures, trees, and graphs.

`Related terms`

Many people are confused about how data, data objects, data elements, and data items relate to each other. The table and code example below sort them out.

User information table users

id	name	sex
001	bigsai	man
002	smallsai	man
003	Caixu Kun	woman

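The POJO class for users

import java.util.ArrayList;
import java.util.List;

class users
{
     // getters and setters omitted
     int id;
     String name;
     String sex;

     users(int id, String name, String sex) {
          this.id = id;
          this.name = name;
          this.sex = sex;
     }
}
// list and woman are both data
List<users> list = new ArrayList<>();  // data object list
List<users> woman = new ArrayList<>(); // data object woman
list.add(new users(1, "bigsai", "man"));      // adds a data element; one users consists of three data items (001, bigsai, man)
list.add(new users(2, "smallsai", "man"));    // data element
list.add(new users(3, "Caixu Kun", "woman")); // data element
woman.add(list.get(2)); // one data element made up of the three data items 003, "Caixu Kun", "woman"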

Data: the symbolic representation of objective things; a collective name for all symbols that can be input into a computer and processed by a program. The three user records in the table above are data (data may also be spread across several tables or collections). Such data is usually entered by users or constructed by the program. Images and sounds are data too.

Data element: the basic unit of data. A data element consists of several data items. Think of it as a POJO object or a record in a database. For example, a single users record such as (001, bigsai, man) is a data element.

Data item: the fields or attributes of a user, such as id, name, and sex, are data items. A data item is the smallest indivisible unit of data. Think of it as one attribute of a POJO object or one field value of a table row.

Data object: a collection of data elements of the same nature, and a subset of the data. For example, the users table, the list collection, and the woman collection above are all data objects. A single table or a single collection can be a data object.

In general, data is the broadest term: everything is data. A data object is a collection of elements of the same nature; it is a subset of the data, but not its basic unit. The basic unit of data is the data element. For example, a table cat and a table dog are both data, and each table is a data object (all its rows describe the same kind of thing), but the basic unit of data is not "cats" or "dogs" as a whole. It is each specific record, such as Small Cat No. 1, Big Cat No. 2, Husky No. 1, or Tibetan Mastiff No. 2.

Two more terms, data types and abstract data types, are covered next.

Data type

Atomic type: a type whose values cannot be divided further, such as int, char, double, and float.

Structure type: a data type whose values can be broken down into several components, such as a struct or class made up of several fields.

Abstract data type (ADT): a data model together with the set of operations defined on it, described independently of any implementation. You can study and use an ADT through its behaviour alone, without considering implementation details. For example, when we use List, Map, or Set, we only need to understand the API and its properties. The concrete implementation may differ: a List can be implemented with an array or with a linked list.
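To make the ADT idea more concrete, here is a minimal sketch in Java. The names `AdtDemo`, `IntStack`, and `ArrayIntStack` are hypothetical and exist only for this illustration: the interface says what a stack does, and the class is just one possible way to implement it.

```java
import java.util.Arrays;

class AdtDemo {
    // Hypothetical ADT for illustration: the interface only says WHAT a stack can do.
    interface IntStack {
        void push(int value);
        int pop();
        boolean isEmpty();
    }

    // One possible implementation, backed by a growable array.
    static class ArrayIntStack implements IntStack {
        private int[] data = new int[4];
        private int size = 0;

        @Override
        public void push(int value) {
            if (size == data.length) {
                data = Arrays.copyOf(data, size * 2); // grow when full
            }
            data[size++] = value;
        }

        @Override
        public int pop() {
            if (size == 0) {
                throw new IllegalStateException("stack is empty");
            }
            return data[--size];
        }

        @Override
        public boolean isEmpty() {
            return size == 0;
        }
    }

    public static void main(String[] args) {
        IntStack stack = new ArrayIntStack(); // the caller depends only on the interface
        stack.push(1);
        stack.push(2);
        System.out.println(stack.pop()); // prints 2 (last in, first out)
        System.out.println(stack.pop()); // prints 1
    }
}
```

A linked-list version could implement the same `IntStack` interface, and the code in `main` would not need to change.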

`Three elements`

Logical structure: the logical relationships between data elements. Logical structures are divided into linear and non-linear structures. Linear structures include sequential lists, linked lists, and the like. Non-linear structures include sets, trees, and graphs.

Storage structure: the representation of a data structure inside the computer (also called its image or physical structure). The main kinds are sequential storage, linked storage, indexed storage, and hash storage. A rough idea of each is enough for now.

Data operations: the operations applied to the data, covering both their definition and their implementation. The definition of an operation is based on the logical structure; its implementation is based on the storage structure.

The concepts that are easy to confuse here are logical structure and storage structure. The key word in logical structure is "logical": it describes how the elements of a data set relate to each other, regardless of their physical addresses. Linear and non-linear structures are examples. What matters is what the data structure is for. For example, a linear list is ordered, so when I need an ordered collection I use a linear list.

The storage structure, in contrast, is tied to physical addresses. The same logical structure can be implemented with different storage structures, and the suitable scenarios and performance may differ. A linear list, for example, can be implemented as a sequential list or as a linked list (ArrayList and LinkedList): one uses sequential storage (an array), the other linked storage. The storage structure is concerned with how the elements are laid out in memory. Data structures built on the same kind of storage usually share similar strengths and weaknesses: sequential storage is fast to look up but slow to insert into, and linked storage is fast to insert into but slow to look up.
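The difference between the two storage structures can be sketched in a few lines of Java. `StorageDemo`, `Node`, and the three helper methods are hypothetical names made up for this example; they are not part of ArrayList or LinkedList.

```java
class StorageDemo {
    // Hypothetical singly linked node, used only for this illustration.
    static class Node {
        int value;
        Node next;

        Node(int value, Node next) {
            this.value = value;
            this.next = next;
        }
    }

    // Sequential storage: the element is reached directly from its index.
    static int getFromArray(int[] array, int index) {
        return array[index];
    }

    // Linked storage: follow the next references one node at a time.
    static int getFromLinked(Node head, int index) {
        Node current = head;
        for (int i = 0; i < index; i++) {
            current = current.next;
        }
        return current.value;
    }

    // Linked storage: inserting after a known node only rewires one reference.
    static void insertAfter(Node node, int value) {
        node.next = new Node(value, node.next);
    }

    public static void main(String[] args) {
        int[] array = {10, 20, 30};
        Node head = new Node(10, new Node(20, new Node(30, null)));
        System.out.println(getFromArray(array, 2));  // prints 30
        System.out.println(getFromLinked(head, 2));  // prints 30, after walking two links
        insertAfter(head, 15);                       // list is now 10, 15, 20, 30
        System.out.println(getFromLinked(head, 1));  // prints 15
    }
}
```

Both versions represent the same logical linear list; only the way elements are reached and inserted differs.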

`Analysis of Algorithms`

The concepts related to data structures were covered above; some concepts of algorithm analysis follow.

Five important characteristics of an algorithm: finiteness, determinism, feasibility, input, and output. Most of these can be understood literally. Finiteness means the algorithm must terminate after a finite number of steps and cannot loop forever. Determinism means every instruction has a precise meaning, so the same input always produces the same output. Feasibility means every step can be carried out by a finite number of basic operations. Input means the algorithm takes zero or more inputs. Output means it produces one or more outputs (there must be at least one).
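As a small illustration, the Euclidean algorithm for the greatest common divisor shows all five characteristics at once. This is only a sketch; the class name `GcdDemo` is made up for the example.

```java
class GcdDemo {
    // Input: two non-negative integers, not both zero.
    // Output: exactly one value, their greatest common divisor.
    static int gcd(int a, int b) {
        // Finiteness: the remainder is always smaller than b, so b reaches 0.
        while (b != 0) {
            int remainder = a % b; // Feasibility: each step is a basic operation.
            a = b;
            b = remainder;
        }
        return a; // Determinism: the same inputs always give the same result.
    }

    public static void main(String[] args) {
        System.out.println(gcd(48, 18)); // prints 6
    }
}
```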

A good algorithm is usually judged by its running time and memory use (time complexity and space complexity). Complexity is usually described as an order of magnitude rather than with specific numbers.

`Space complexity`

Concept: a measure of the amount of storage space an algorithm temporarily occupies while running, written S(n) = O(f(n)).

Space complexity usually carries less weight than time complexity when evaluating algorithms (we often use data structures and algorithms that trade space for time), but it cannot be ignored. Whether you are solving a practice problem or running a real project in production, memory is an important metric. This is especially true for Java, whose own memory footprint is already large: poorly chosen storage logic takes up more system resources and puts pressure on the service.

In many cases an algorithm trades space for time (efficiency). Take string matching, as used by the familiar String.contains() method. Brute-force matching needs no extra memory, but its worst-case time complexity is about O(n*m) for a text of length n and a pattern of length m. The KMP algorithm is faster than the naive brute-force method, but it needs an extra array (next[]) to record matching information, which costs space. As another example, merge sort uses an auxiliary array during its recursive divide-and-merge steps to improve efficiency, and the extra memory overhead has little practical impact.
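The trade-off can be seen in a minimal sketch of both approaches. The names `MatchDemo`, `naiveIndexOf`, `buildNext`, and `kmpIndexOf` are hypothetical, and the `next[]` layout used here (longest proper prefix that is also a suffix) is one common variant, assumed for illustration. This is not how the JDK implements String.contains().

```java
class MatchDemo {
    // Brute force: no extra memory, but up to n * m character comparisons.
    static int naiveIndexOf(String text, String pattern) {
        int n = text.length();
        int m = pattern.length();
        for (int i = 0; i + m <= n; i++) {
            int j = 0;
            while (j < m && text.charAt(i + j) == pattern.charAt(j)) {
                j++;
            }
            if (j == m) {
                return i;
            }
        }
        return -1;
    }

    // next[i] = length of the longest proper prefix of pattern[0..i]
    // that is also a suffix of it. This is the extra O(m) space KMP pays.
    static int[] buildNext(String pattern) {
        int[] next = new int[pattern.length()];
        int k = 0;
        for (int i = 1; i < pattern.length(); i++) {
            while (k > 0 && pattern.charAt(i) != pattern.charAt(k)) {
                k = next[k - 1];
            }
            if (pattern.charAt(i) == pattern.charAt(k)) {
                k++;
            }
            next[i] = k;
        }
        return next;
    }

    // KMP: never moves backwards in the text, so matching takes O(n + m) time.
    static int kmpIndexOf(String text, String pattern) {
        if (pattern.isEmpty()) {
            return 0;
        }
        int[] next = buildNext(pattern);
        int k = 0; // number of pattern characters matched so far
        for (int i = 0; i < text.length(); i++) {
            while (k > 0 && text.charAt(i) != pattern.charAt(k)) {
                k = next[k - 1]; // reuse what has already been matched
            }
            if (text.charAt(i) == pattern.charAt(k)) {
                k++;
            }
            if (k == pattern.length()) {
                return i - k + 1;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(naiveIndexOf("abababca", "abca")); // prints 4
        System.out.println(kmpIndexOf("abababca", "abca"));   // prints 4
    }
}
```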

Of course, the space an algorithm uses cannot exceed the maximum heap size configured for the JVM (adjustable with the -Xmx option), and a single Java array cannot hold more than about 2^31 elements (2147483645). When allocating two-dimensional or higher-dimensional arrays, do not make them too large, or the program may fail with a heap OutOfMemoryError.
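Before allocating a large array, a rough check like the following can help. This is only a sketch: `MemoryDemo` and `estimateIntMatrixBytes` are hypothetical names, and the estimate assumes 4 bytes per int while ignoring object headers and padding.

```java
class MemoryDemo {
    // Rough estimate only: 4 bytes per int, ignoring array headers and padding.
    static long estimateIntMatrixBytes(long rows, long cols) {
        return rows * cols * 4L;
    }

    public static void main(String[] args) {
        // Upper bound on the heap the JVM will attempt to use.
        long maxHeap = Runtime.getRuntime().maxMemory();
        long needed = estimateIntMatrixBytes(20_000, 20_000); // about 1.6 GB

        System.out.println("max heap bytes: " + maxHeap);
        System.out.println("matrix bytes:   " + needed);
        System.out.println(needed < maxHeap
                ? "may fit, depending on what else is on the heap"
                : "would likely throw OutOfMemoryError");
    }
}
```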

`time complexity`

Concept: in computer science, the time complexity of an algorithm is a function that qualitatively describes its running time in terms of the length of the input. It is usually written in big O notation, which drops the lower-order terms and the leading coefficient. Expressed this way, time complexity is asymptotic: it describes the behaviour as the input size approaches infinity.

Time complexity ranking: O(1) < O(logn) < O(n) < O(nlogn) < O(n^2) < O(n^3) < O(2^n) < O(n!) < O(n^n)

Common time complexities: many people have only a vague idea of time complexity. The examples below illustrate some common cases.

O(1): constant function

a=15

O(logn): logarithmic function

for(int i=1;i<n;i*=2)
Analysis: suppose the loop body runs t times before i reaches n. Then 2^t = n, so t = log2(n), and the time complexity is O(logn).
Binary search, the extended Euclidean algorithm, and fast exponentiation are also typical O(logn) algorithms. Algorithms at this level are highly efficient.
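Binary search is the classic case: each comparison halves the remaining range, so a sorted array of n elements needs about log2(n) steps. A minimal sketch (the class name `BinarySearchDemo` is made up for this example):

```java
class BinarySearchDemo {
    // Returns the index of target in a sorted array, or -1 if it is absent.
    static int binarySearch(int[] sorted, int target) {
        int low = 0;
        int high = sorted.length - 1;
        while (low <= high) {
            int mid = low + (high - low) / 2; // avoids overflow of (low + high)
            if (sorted[mid] == target) {
                return mid;
            }
            if (sorted[mid] < target) {
                low = mid + 1;  // discard the left half
            } else {
                high = mid - 1; // discard the right half
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] sorted = {1, 3, 5, 7, 9, 11};
        System.out.println(binarySearch(sorted, 7)); // prints 3
        System.out.println(binarySearch(sorted, 4)); // prints -1
    }
}
```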

O(n): linear function

for (int i=0;i<n;i++)
It is relatively common and can solve most problems well.

O(nlogn):

for (int i=1;i<n;i++)
    for (int j=1;j<i;j*=2)
Common sorting algorithms such as quicksort and merge sort run in O(nlogn) in the typical case. Algorithms at this level are generally quite efficient.

O(n^2)

for(int i=0;i<n;i++)
    for(int j=0;j<i;j++)
Frankly, the efficiency of O(n^2) is nothing to praise. On large inputs, O(n^2) or any higher power performs poorly.
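One way to get a feel for these growth rates is to count how many times each loop body actually runs. The sketch below wraps the loop shapes above in hypothetical helper methods (`LoopCountDemo` and its method names are made up for illustration).

```java
class LoopCountDemo {
    // O(logn): i doubles each time (long avoids overflow for very large n).
    static long logCount(int n) {
        long count = 0;
        for (long i = 1; i < n; i *= 2) {
            count++;
        }
        return count;
    }

    // O(n): one pass.
    static long linearCount(int n) {
        long count = 0;
        for (int i = 0; i < n; i++) {
            count++;
        }
        return count;
    }

    // O(nlogn): a doubling loop nested inside a linear loop.
    static long nLogNCount(int n) {
        long count = 0;
        for (int i = 1; i < n; i++) {
            for (long j = 1; j < i; j *= 2) {
                count++;
            }
        }
        return count;
    }

    // O(n^2): the inner loop runs 0 + 1 + ... + (n - 1) times in total.
    static long quadraticCount(int n) {
        long count = 0;
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < i; j++) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        int n = 16;
        System.out.println(logCount(n));       // prints 4
        System.out.println(linearCount(n));    // prints 16
        System.out.println(nLogNCount(n));     // prints 45
        System.out.println(quadraticCount(n)); // prints 120
    }
}
```

The exact counts differ from n*log2(n) or n^2 by constant factors, which is exactly what big O notation ignores.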

Of course, for the same n = 10000, algorithms with different time complexities differ greatly in the number of operations and the time they take.

Complexity	n	Number of executions
O(1)	10000	1
O(log2(n))	10000	14
O(n^(1/2))	10000	100
O(n)	10000	10000
O(nlog2(n))	10000	140000
O(n^2)	10000	100000000
O(n^3)	10000	1000000000000

Reducing an algorithm's complexity often relies on the strengths of a particular data structure, such as lookup in a binary search tree or dynamic interval operations on a segment tree; these structures solve certain problems with very good performance. In other cases the gain comes from the algorithmic strategy: simple, naive sorts such as bubble sort are O(n^2), while smarter ones such as quicksort and merge sort achieve O(nlogn). To go faster, you have to master more advanced data structures and more sophisticated algorithms.

Time complexity calculation: the usual steps are 1. find the statement that executes the most times; 2. work out the order of magnitude of its execution count; 3. express the result in big O notation. There are also two rules:

Addition rule: if a program contains several blocks that run one after another, the overall complexity is that of the largest one, e.g.:

T(n)=O(m)+O(n)=max(O(m),O(n)); 
T(n)=O(n)+O(nlogn)=max(O(n),O(nlogn))=O(nlogn);

Multiplication rule: for nested loops, the complexities are multiplied, e.g.:

T(n)=O(m)*O(n)=O(mn)
T(n)=O(m)*O(m)=O(m^2) (two nested for loops)
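Both rules can be checked by counting steps. In this sketch, `RuleDemo`, `sequential`, and `nested` are hypothetical names chosen for the example.

```java
class RuleDemo {
    // Addition rule: two loops one after another take m + n steps,
    // which is O(m + n) = O(max(m, n)).
    static long sequential(int m, int n) {
        long steps = 0;
        for (int i = 0; i < m; i++) {
            steps++;
        }
        for (int j = 0; j < n; j++) {
            steps++;
        }
        return steps;
    }

    // Multiplication rule: a loop inside a loop takes m * n steps, which is O(mn).
    static long nested(int m, int n) {
        long steps = 0;
        for (int i = 0; i < m; i++) {
            for (int j = 0; j < n; j++) {
                steps++;
            }
        }
        return steps;
    }

    public static void main(String[] args) {
        System.out.println(sequential(3, 5)); // prints 8
        System.out.println(nested(3, 5));     // prints 15
    }
}
```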

Of course, the time complexity of many algorithms also depends on the input data, so we distinguish the best-case time complexity (fewest operations), the worst-case time complexity (most operations), and the average time complexity. These are analysed in detail for the sorting algorithms, but we usually use the average time complexity to judge an algorithm.
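Linear search is a simple way to see how the input changes the cost. The sketch below counts comparisons; `CaseDemo` and `comparisons` are hypothetical names used only here.

```java
class CaseDemo {
    // Returns how many comparisons a linear search makes before it stops.
    static int comparisons(int[] data, int target) {
        int count = 0;
        for (int value : data) {
            count++;
            if (value == target) {
                break; // found: stop early
            }
        }
        return count;
    }

    public static void main(String[] args) {
        int[] data = {4, 8, 15, 16, 23, 42};
        System.out.println(comparisons(data, 4));  // best case: prints 1
        System.out.println(comparisons(data, 42)); // worst case (last element): prints 6
        System.out.println(comparisons(data, 7));  // worst case (absent): prints 6
    }
}
```

The best case is O(1) and the worst case is O(n). If the target is equally likely to be at any position, about n/2 comparisons are needed on average, which is still O(n).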

`Data structure and algorithm learning`

With the basic concepts introduced, here is the learning path for classic data structures and algorithms that I personally recommend. I hope it gives you a useful reference:

`data structure`

Singly linked list design and implementation (with and without a head node; insert, delete, update, query); doubly linked list design and implementation
Stack design and implementation (array and linked list); queue design and implementation (array and linked list)
Binary tree concepts; pre-order, in-order, and post-order traversal, recursive and non-recursive; level-order traversal
Binary search tree design and implementation (insert and delete)
Heap (priority queue, heap sort)
AVL (balanced) tree design and implementation (understanding and implementing the four rotation cases)
Splay trees and red-black trees: understanding the principles
B-trees and B+ trees: understanding the principles
Huffman tree: understanding the principle (greedy strategy)
Hash table: understanding the principles (several ways to resolve hash collisions)
Union-find / disjoint set (optimizations and path compression)
Graph theory: topological sort
Graph theory: DFS (depth-first traversal) and BFS (breadth-first traversal)
Shortest paths: Dijkstra, Floyd, and SPFA algorithms
Minimum spanning tree: Prim and Kruskal algorithms
Other data structures: segment trees, suffix arrays, etc.

`Classical Algorithm`

Recursion (factorial, Fibonacci, Tower of Hanoi)
Binary search
Divide and conquer (quicksort, merge sort, closest pair of points, etc.)
Greedy algorithms (widely used; interval selection problem, interval covering problem)
Common dynamic programming (LCS (longest common subsequence), LIS (longest increasing subsequence), knapsack problems, etc.)
Backtracking (the classic eight queens problem, permutations)
Common bit manipulation problems (see Jianzhi Offer and LeetCode problems)
Fast exponentiation (fast power, fast matrix power)
KMP and other string matching algorithms
Other number theory algorithms (Euclidean, extended Euclidean, Chinese remainder theorem, etc.)

I believe that after reading this article you have a good overview of data structures and algorithms. The two are very closely related: a data structure exists to support an algorithm, and the algorithm is the real goal. When learning a data structure or algorithm, first read books or blogs to understand what it is for, then study how it works, and then put it into practice by implementing the data structure or solving related problems. Progress step by step in this way. If you want to learn data structures and algorithms in depth, reading alone is not enough; you need to write a lot of code. And this road has no end: keep living, keep learning, and keep practising problems.

Originally published on the WeChat official account bigsai. This article is included in the author's data structure and algorithm learning repository; stars are welcome.

See you next time!
