Intro to Data - 04-06 - Composite Examples
Jan 1, 2020 22:55 · 437 words · 3 minute read
In data science, we have several common composite data types. We’re going to learn about the ones that you’re most likely to encounter. First, we have the composite data types that represent homogenous data. First, we have a vector, also known as an array, which is a one- dimensional sequence of homogenous data. Vectors are used to store a list of elements that are all of the same data type.
00:23 - For example, character strings, which we discussed earlier, are a sequence of characters stored in a vector. Second, we have a matrix, which is a two-dimensional grid of homogenous data. Matrices are typically used to store and process groups of related numbers using a set of mathematical operations known as matrix algebra. Finally, we have a tensor, which is a three-dimensional cube or n-dimensional hypercube of homogenous data. Tensors are typically used to create deep neural networks in machine learning, which is where the deep-learning framework TensorFlow gets its name.
00:54 - Next, we have composite data types that represent tabular data. First, we have a dictionary, which is a two-column table that stores a list of key-value pairs. A dictionary, also known as a look-up table, is used to quickly retrieve data by a unique identifier. Second, we have a table. A table stores data as a set of rows and columns. Tables are the most common way you will encounter structured data in data science. So we’ll discuss tabular data in much more detail in the next module. Next, we have composite data types that represent semi-structured data. First, we have a tree, which organizes data as a set of nodes and branches. Trees are used to represent hierarchical data (i.e. data that are organized into parent-child relationships).
01:37 - Second, we have a graph, which organizes data as a set of nodes and edges. Graphs are used to represent a network of data, representing each item as a node and each relationship as an edge. Finally, we have composite data types to represent multimedia data. For example: - A body of text is essentially a long vector of characters. - Images are represented as a two-dimensional matrix of pixels. - Audio is represented as a one-dimensional vector of sound waves over time. - Video is represented as sound and a set of images moving frame-by- frame over time. - and shape data is a graph of points, lines, and polygons used to construct geometric structures like maps and 3D objects. Many more composite data types exist in data science; however, these are the ones that you are most likely to encounter. .