DEV Community

Cover image for SQL FOR DATASCIENCE
Simone Kamande
Simone Kamande

Posted on

SQL FOR DATASCIENCE

Image description

WHAT IS SQL
SQL basically stands for Structure Query Language.it is designed to retrieve and manage data that is held in a daabase.furthermore, SQL is the standard languagefor relational database management systems.Ideally it is sort pf a phonebook,where you use it to find data, call it up retrieve it and work on it.

WHY SQL:

  • SQL can execute queries againist a database.
  • SQL can create a new database
  • SQL can delete records from a database
  • SQL can insert records from a database
  • SQL can set permissions on tables, procedures and views.
  • /very essential in data science/``

** TYPES OF SQL SERVERS**

Microsoft SQL Server: This is a popular relational database management system (RDBMS) that is widely used in enterprise applications and supports a range of operating systems, including Windows, Linux, and macOS. It includes a wide range of features and tools for managing and analyzing data, and also supports integration with other Microsoft technologies such as .NET and Visual Studio.

Oracle Database: This is another popular RDBMS that is widely used in enterprise environments. It offers a range of advanced features such as high-availability clustering, advanced security options, and support for large-scale data processing.

MySQL: This is an open-source RDBMS that is widely used for web applications and other types of software development. It offers a range of features and tools for managing and analyzing data, and also supports integration with other programming languages such as PHP and Python.

PostgreSQL: This is another open-source RDBMS that is widely used for web applications and other types of software development. It offers advanced features such as support for advanced data types, full-text search, and geospatial data.
_
SQLite:_ This is a lightweight, file-based SQL server that is widely used in embedded systems and mobile applications. It offers a simple, easy-to-use interface for managing and querying data, and can be a good choice for small-scale applications.

Overall, the different types of SQL servers are designed to meet different needs and use cases, from small-scale web applications to large-scale enterprise systems. When choosing which type of SQL server to use, it's important to consider factors such as the size and complexity of your application, your budget, and the level of support and features you need.

_ ADVANTAGES OF SQL_

  • Simplicity: SQL is a relatively easy-to-learn language, with a simple syntax that allows you to create, modify, and query databases using only a few basic commands.

  • Flexibility: SQL is a highly versatile language that can be used with many different types of databases, including relational databases, NoSQL databases, and cloud-based databases.

  • Scalability: SQL is designed to handle large amounts of data and can be used to build highly scalable systems that can handle millions or even billions of records.

  • Security: SQL provides a range of security features that can help protect data from unauthorized access, including access controls, encryption, and data masking.

  • Integration: SQL can be integrated with other programming languages and technologies, allowing you to easily incorporate data from different sources into your applications.

  • Standardization: SQL is a widely adopted standard for managing and analyzing data, which means that it's supported by a large community of developers, vendors, and tools.

Overall, the advantages of using SQL make it a popular choice for managing and analyzing data, particularly in enterprise and web applications.
** DISADVANTAGES OF SQL**

  • Steep Learning Curve: While the basics of SQL can be relatively easy to learn, mastering the language and advanced concepts like stored procedures, triggers, and complex queries can take time and effort.
  • Relational Data Model: SQL is designed around the relational data model, which may not be the best fit for all types of data or applications.
  • Scalability: While SQL is generally scalable, managing large databases can become complex and require significant hardware and software resources.
  • Limited Functionality: SQL is not a full-featured programming language and may not offer all the capabilities needed for complex data analysis or application development.
  • Security Risks: As with any database system, SQL can pose security risks if not properly secured, including the risk of SQL injection attacks.
  • Vendor Lock-In: Different database vendors may implement SQL differently, which can make it difficult to switch databases or migrate data to a different system.

Overall, while the advantages of SQL often outweigh the disadvantages, it's important to consider the potential drawbacks and assess whether SQL is the best choice for your specific needs and use case.
SQL (Structured Query Language) is a programming language used to maintain and manage data stored in relational databases.

key to note that after every code a (;)must be used in order for the code to function

Some Important Concepts of Relational Databases
1). Relational Database: A relational database is a database that stores data in one or more tables.

2). Table: A table is a collection of rows and columns. A table is also called a relation in the database world.

3). Column: A column is a set of data values of a particular type.

4). Row: A row is a single record in a table.

All data stored in the relational database needs to be of a particular type. The most common ones are:
INTEGER, TEXT, REAL, DATE (yyyy-mm-dd)

5). Statement: A statement is a line of text that is recognized as a valid command by the database. Statements need to end with a semicolon(;).

Basic HTML Commands:

`
CREATE TABLE planets(
id INTEGER,
name TEXT,
distance INTEGER
);
`

CREATE TABLE is a new clause that allows the user to create a new table. planets is the name of the table. (id INTEGER, name TEXT, distance INTEGER) is a list of columns along with their data type.

`

INSERT INTO planets(id, name, distance) values (3, 'Earth', 300);
`

INSERT INTO is a clause used to add rows into a table. planets is the name of the table followed by the columns where data is being added. values keyword allows us to add data into the table.

Let’s add some more data to add life to the table.

`
INSERT INTO planets(id, name, distance) values (1, 'Mercury', 100);
-INSERT INTO planets(id, name, distance) values (2, 'Venus', 200);
INSERT INTO planets(id, name, distance) values (4, 'Mars', 400);
INSERT INTO planets(id, name, distance) values (5, 'Jupiter', 500);-
`

Now let’s learn to make a query.

`
SELECT name FROM planets;
`

SELECT is used to query data from any database, which we would be using a lot. name is the column we want to retrieve and planets is the table name.

`

SELECT * FROM planets;
-- selects all columns from the table.
`

What if we forgot to add a column to the table while creating it? SQL got us covered!

`
ALTER TABLE planets ADD COLUMN color TEXT;
`

ALTER TABLE is used to add a new column to the planets table. color is the name of the column with the data type of TEXT.

`
UPDATE planets
SET color = 'blue'
WHERE name = 'Earth';
-- where id = 3;
-- where distance = 300;
-- can also be used
`

UPDATE is used to update records/rows in a table. color is the column name, name is used to identify the row where the update has to take place.

`

DELETE FROM planets WHERE color IS NULL;
`

DELETE is used to remove one or more records from the table.
**
BASIC FUNCTIONS USED IN SQL**
= means equl to.

means greater than
< means lesser than
=greater than or equal to
<=lesser than or equal to
<> not equal , but some !=
BETWEEN - specifies values that are within the set condtions
LIKE= searching for a pattern
IN = specifies multiple possible values

AND displays the result if both the conditions have been met
OR displays a condtion if either a conditon has been met
ORDER BY sorts using a criteria fro one to the next
ORDER BY BOTH -it will order using the first criteria and if the first criteria looks similar for some values it will start sorting using the second condition

KEY COMMANDS WHILE EXECUTING RELATIONSHIPS IN DATABASES

Inner Join: This type of join returns only the rows that have matching values in both tables.

Left Join (or Left Outer Join): This type of join returns all the rows from the left table, along with any matching rows from the right table. If there are no matching rows in the right table, the result will contain NULL values for the right table columns.
_
Right Join _(or Right Outer Join): This type of join returns all the rows from the right table, along with any matching rows from the left table. If there are no matching rows in the left table, the result will contain NULL values for the left table columns.

Full Join (or Full Outer Join): This type of join returns all the rows from both tables, along with NULL values for any columns that do not have a matching row in the other table.

Cross Join (or Cartesian Product): This type of join returns all possible combinations of rows from the two tables.

With this its a headstart on how to start sql basically they can be connected to python and github for further analysis ,but i have found it user friendly and easy to understand .

                                            -simon-
Enter fullscreen mode Exit fullscreen mode

Top comments (0)