<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: shubham chaudhari</title>
    <description>The latest articles on DEV Community by shubham chaudhari (@shubh28698).</description>
    <link>https://dev.to/shubh28698</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F400342%2Ff1c3b4e7-69c2-49ab-83c5-1a7909d2c115.jpg</url>
      <title>DEV Community: shubham chaudhari</title>
      <link>https://dev.to/shubh28698</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/shubh28698"/>
    <language>en</language>
    <item>
      <title>Tools you can master as data science beginners</title>
      <dc:creator>shubham chaudhari</dc:creator>
      <pubDate>Wed, 17 Jun 2020 13:38:53 +0000</pubDate>
      <link>https://dev.to/shubh28698/tools-you-can-master-as-data-science-beginners-2n9m</link>
      <guid>https://dev.to/shubh28698/tools-you-can-master-as-data-science-beginners-2n9m</guid>
      <description>&lt;p&gt;&lt;em&gt;“We're entering a new world in which data may be more important than software.” – Tim O'Reilly, founder, O'Reilly Media&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Data is proving more influential than ever. It helps organizations and tech companies decide their strategies for bringing products or services to market, which can generate significant revenue. Data also yields meaningful insights that reveal the patterns or mindset of the general public without asking people directly.&lt;/p&gt;

&lt;p&gt;Data has become important not only for the IT sector but also for others such as entertainment, healthcare, banking, and automobiles. Almost everyone needs it. But to make data useful and arrive at a particular outcome, it has to go through a particular procedure or approach. All of this requires performing some complex tasks and is very time-consuming.&lt;/p&gt;

&lt;p&gt;Tools help overcome these constraints and reach accurate results, whatever the technology stack.&lt;/p&gt;

&lt;p&gt;So here are some tools you can get to grips with if you are starting your journey as a data scientist.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Excel&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Excel is the tool that nearly everyone first meets at school. It is very simple, yet very important for data scientists. Almost every organization, big or small, uses it, and it commands tremendous respect. Most people know it as a simple spreadsheet tool, but it can be a powerful weapon for a data scientist. Here are some reasons:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You can create and view datasets, which are the heart of data science&lt;/li&gt;
&lt;li&gt;Python extensions are available&lt;/li&gt;
&lt;li&gt;Data analysis for getting quick insights&lt;/li&gt;
&lt;li&gt;Dashboards can be generated&lt;/li&gt;
&lt;li&gt;Availability of mathematical operations&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;2. Tableau&lt;/strong&gt;&lt;br&gt;
Tableau is a powerful, fast-growing data visualization tool used in almost every industry, and it serves data scientists well for visualizing their data. It is easy to learn and can be mastered with thorough practice. It is available as Tableau Public and Tableau Desktop, and has many courses and offers for students. Its qualities include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Perfect for data analysis as it has many rich features that can make visuals more attractive.&lt;/li&gt;
&lt;li&gt;No coding knowledge required&lt;/li&gt;
&lt;li&gt;Works well with big data&lt;/li&gt;
&lt;li&gt;Many options to secure data without scripting&lt;/li&gt;
&lt;li&gt;Has its own server&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;3. Apache Spark&lt;/strong&gt;&lt;br&gt;
Apache Spark™ is a unified analytics engine for large-scale data processing. It has many features and sub-tools that ease the work of data scientists, saving time and making coding more efficient.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Advantages:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Provides APIs in Scala, Java, Python, and R&lt;/li&gt;
&lt;li&gt;Integrates well with the Hadoop ecosystem and data sources (HDFS, Amazon S3, Hive, HBase, Cassandra, etc.)&lt;/li&gt;
&lt;li&gt;Run workloads faster&lt;/li&gt;
&lt;li&gt;Combine SQL, streaming, and complex analytics&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;4. Jupyter&lt;/strong&gt;&lt;br&gt;
Jupyter is a development environment for data scientists. It provides multi-language interactive computing environments. Its Notebook, an open source web application, allows data scientists to create and share documents containing live code, equations, visualizations, and explanatory text.&lt;br&gt;
Some features of Jupyter:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Configure and arrange the user interface to support a wide range of workflows in data science, scientific computing, and machine learning&lt;/li&gt;
&lt;li&gt;Use interactive widgets to manipulate and visualize data in real time&lt;/li&gt;
&lt;li&gt;Extensible and modular: write plugins that add new components and integrate with existing ones&lt;/li&gt;
&lt;li&gt;Supports many programming languages, including popular data science languages such as Python, R, and Julia&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;No doubt there are many more tools, but if you can learn and master these at a basic level, your path toward becoming a data scientist should be somewhat easier.&lt;/p&gt;

&lt;p&gt;Here are some links to learning resources, so that they are easily accessible to all data science enthusiasts.&lt;/p&gt;

&lt;p&gt;1.&lt;a href="https://www.youtube.com/playlist?list=PLlz0muypSBNbd6VbNdU9cCzCQx8_dU7Hy"&gt;https://www.youtube.com/playlist?list=PLlz0muypSBNbd6VbNdU9cCzCQx8_dU7Hy&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;2.&lt;a href="https://www.tableau.com/learn/training/20202"&gt;https://www.tableau.com/learn/training/20202&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;3. For Apache Spark, you can refer to Srivatsan Srinivasan's AIEngineering channel on YouTube&lt;br&gt;
&lt;a href="https://youtu.be/pEi-Ak5l00A"&gt;https://youtu.be/pEi-Ak5l00A&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Any feedback would be greatly appreciated.&lt;/p&gt;

</description>
      <category>datascience</category>
      <category>machinelearning</category>
      <category>nlp</category>
      <category>beginners</category>
    </item>
    <item>
      <title>Basic pre-requisites for data scientists</title>
      <dc:creator>shubham chaudhari</dc:creator>
      <pubDate>Thu, 04 Jun 2020 18:19:03 +0000</pubDate>
      <link>https://dev.to/shubh28698/basic-pre-requisites-for-data-scientists-31b5</link>
      <guid>https://dev.to/shubh28698/basic-pre-requisites-for-data-scientists-31b5</guid>
      <description>&lt;p&gt;Technological advancements in data science, and the benefits they bring, have attracted many people to pursue a career in the field. In tech, it is always good to know the essentials before pursuing a career in a particular area. This helps you plan the path to your target and gives you an idea of how much effort you will have to put in to reach the final goal. This post aims to outline the essentials for becoming a data scientist.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Approach and understanding&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;In my opinion, the most important thing in the tech field is to understand and figure out how to approach a programming problem.&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Once you know the approach, you can solve the problem more easily, and designing an algorithm becomes easier too.&lt;br&gt;
For example, if I have to solve an NLP problem, my approach would be:&lt;/p&gt;

&lt;p&gt;a] Data gathering&lt;br&gt;
   b] Data cleaning&lt;br&gt;
   c] Figure out good representation&lt;br&gt;
   d] Classification&lt;br&gt;
   e] Inspection&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;As for the programming itself, it can be learned easily, because it is mostly syntax and methods that you can find on the internet.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
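
&lt;p&gt;The five NLP steps above can be sketched roughly as follows. This is only a toy illustration using the Python standard library: the sample reviews, the labels, and the function names (clean, represent, train, classify) are all made up for the example, and the word-overlap scoring stands in for a real classifier.&lt;/p&gt;

```python
import re
from collections import Counter

# a] Data gathering: a tiny hand-written sample (hypothetical data).
REVIEWS = [
    ("The food was great and arrived hot", "positive"),
    ("Great taste, quick delivery", "positive"),
    ("Cold food and very late delivery", "negative"),
    ("Late again, the order was cold", "negative"),
]


def clean(text):
    """b] Data cleaning: lowercase the text and keep only alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def represent(tokens):
    """c] Representation: a bag of words (each word mapped to its count)."""
    return Counter(tokens)


def train(samples):
    """Sum the word counts seen for each label."""
    totals = {}
    for text, label in samples:
        totals.setdefault(label, Counter()).update(represent(clean(text)))
    return totals


def classify(text, totals):
    """d] Classification: choose the label whose words overlap the text most."""
    bag = represent(clean(text))
    scores = {
        label: sum(count * seen[word] for word, count in bag.items())
        for label, seen in totals.items()
    }
    return max(scores, key=scores.get)


model = train(REVIEWS)

# e] Inspection: look at the predictions for unseen text.
for sample in ["great food", "delivery was late and cold"]:
    print(sample, "=", classify(sample, model))
```

&lt;p&gt;In a real project each step would be far larger, but the order of the steps stays the same.&lt;/p&gt;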




&lt;p&gt;&lt;strong&gt;2. Slight business understanding&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;To get started in data science, it is very important to have a basic understanding of business and to know which business you are going to work for, because the problems solved with data science vary from business to business.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For example, if I am working for Zomato or Swiggy, my role as a data scientist might be to analyze customers' reviews of the food and delivery services, which can help the company make further improvements. So I must have a sense of the food delivery industry.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;3. Languages like Python and R&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The versatility of these languages makes them among the most used for data science. They can take various forms of data, prepare it for processing, and, with the help of algorithms, provide insights and results&lt;/li&gt;
&lt;li&gt;Besides, extensive libraries are available for most tasks, which ease the work of data scientists and speed up operations&lt;/li&gt;
&lt;li&gt;They provide functionality for mathematics and statistics&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So, it is very important to have knowledge of these languages.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;4. Familiarity with machine learning algorithms&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;This carries equal weight with the other prerequisites, because ML algorithms help predict likely results&lt;/li&gt;
&lt;li&gt;ML algorithms help draw out important aspects of the data using concepts from mathematics and statistics&lt;/li&gt;
&lt;li&gt;You are expected to know at least how the algorithms work, so that you can implement them easily&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;strong&gt;5. Visualization tools (Tableau, Power BI)&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;These tools are very helpful for data analysts: they help them chart and visualize data, so that they can design a strategy for the organization based on what the visualizations show&lt;/li&gt;
&lt;li&gt;The tools are handy for data manipulation, and attractive visualizations can be designed with them&lt;/li&gt;
&lt;li&gt;These tools can help monitor businesses and provide rich dashboards on any device&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So learning these tools can help you remain one step ahead.&lt;/p&gt;

&lt;p&gt;Keeping these things in mind will help you progress toward the field of data science and design your pathway.&lt;/p&gt;

&lt;p&gt;Any feedback or suggestions would be greatly appreciated.&lt;/p&gt;

</description>
      <category>datascience</category>
      <category>machinelearning</category>
      <category>nlp</category>
      <category>beginners</category>
    </item>
    <item>
      <title>Top Library setups for Data Science beginners</title>
      <dc:creator>shubham chaudhari</dc:creator>
      <pubDate>Tue, 02 Jun 2020 18:01:19 +0000</pubDate>
      <link>https://dev.to/shubh28698/top-library-setups-for-data-science-beginners-h6a</link>
      <guid>https://dev.to/shubh28698/top-library-setups-for-data-science-beginners-h6a</guid>
      <description>&lt;p&gt;Data science has been a boon for many applications, whether in healthcare, education, entertainment, or industrial sectors such as supply chain logistics. Many students and professionals aspire to work in data science or are making a career transition into it. This post gives them a small start with libraries.&lt;/p&gt;

&lt;p&gt;This post covers some important libraries for beginners, along with installation commands and short introductions, so that they do not have to spend much time browsing the internet and can find everything in one place.&lt;/p&gt;

&lt;p&gt;The only prerequisite for libraries is Python itself.&lt;/p&gt;

&lt;p&gt;All the installation commands should be run in the command prompt (on Windows) or a terminal (on Linux).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. NumPy&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Powerful N-dimensional arrays&lt;/li&gt;
&lt;li&gt;Numerical computing tools&lt;/li&gt;
&lt;li&gt;Interoperable&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Installation commands&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;For windows:&lt;br&gt;
    &lt;strong&gt;pip install numpy&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;For linux:&lt;br&gt;
    &lt;strong&gt;$ sudo apt install python3-numpy&lt;/strong&gt;&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;2. SciPy&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Open source&lt;/li&gt;
&lt;li&gt;Contains modules for optimization, linear algebra, integration, and special functions&lt;/li&gt;
&lt;li&gt;Depends on numpy and imports many numpy functions&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Installation commands&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;For windows:&lt;br&gt;
    &lt;strong&gt;pip install scipy&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;For linux:&lt;br&gt;
    &lt;strong&gt;$ sudo apt-get install python3-pip&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
    &lt;strong&gt;$ python3 -m pip install --user numpy scipy&lt;/strong&gt;&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;3. Pandas&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Contains high-level data structures and manipulation tools &lt;/li&gt;
&lt;li&gt;For data manipulation and analysis&lt;/li&gt;
&lt;li&gt;Fast and flexible&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Installation commands&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;For windows:&lt;br&gt;
    &lt;strong&gt;pip install pandas&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;For linux:&lt;br&gt;
    &lt;strong&gt;$ pip3 install pandas&lt;/strong&gt;&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;4. Scikit-learn&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Simple and efficient tools for predictive data analysis&lt;/li&gt;
&lt;li&gt;Many tools for machine learning and statistical modeling, including classification, regression, clustering, and dimensionality reduction&lt;/li&gt;
&lt;li&gt;Features various algorithms such as support vector machines and random forests&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Installation commands&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;For windows:&lt;br&gt;
    &lt;strong&gt;pip install scikit-learn&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;For linux:&lt;br&gt;
    &lt;strong&gt;$ pip3 install scikit-learn&lt;/strong&gt;&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;5. NLTK&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Contains text processing libraries for tokenization, parsing, classification, stemming, tagging and semantic reasoning.&lt;/li&gt;
&lt;li&gt;Used for developing applications and services that are able to understand human languages&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Installation commands&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;For windows:&lt;br&gt;
    &lt;strong&gt;pip3 install nltk&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;For linux:&lt;br&gt;
    &lt;strong&gt;$ sudo apt-get install python3-numpy python3-nltk&lt;/strong&gt;&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;6. Matplotlib&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Comprehensive library for creating static, animated, and interactive visualizations&lt;/li&gt;
&lt;li&gt;Develop publication quality plots with just a few lines of code&lt;/li&gt;
&lt;li&gt;Use interactive figures that can zoom, pan, update&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Installation commands&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;For windows:&lt;br&gt;
    &lt;strong&gt;python -m pip install -U pip&lt;/strong&gt;&lt;br&gt;
    &lt;strong&gt;python -m pip install -U matplotlib&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;For linux:&lt;br&gt;
    &lt;strong&gt;$ sudo apt-get install python3-matplotlib&lt;/strong&gt;&lt;/p&gt;
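
&lt;p&gt;Once the commands above have run, a quick way to confirm that the libraries are visible to Python is a small check like the one below. It is a minimal sketch using only the standard library; the helper names (is_installed, report) are made up for this example. Note that scikit-learn is imported under the name sklearn.&lt;/p&gt;

```python
import importlib.util

# Import names for the six libraries above. scikit-learn is imported
# as "sklearn", not under its pip package name.
LIBRARIES = ["numpy", "scipy", "pandas", "sklearn", "nltk", "matplotlib"]


def is_installed(module_name):
    """Return True if Python can locate the module, without importing it."""
    return importlib.util.find_spec(module_name) is not None


def report(names):
    """Map each module name to whether it is installed."""
    return {name: is_installed(name) for name in names}


for name, found in report(LIBRARIES).items():
    print(name, "OK" if found else "missing")
```

&lt;p&gt;Any library reported as missing can be installed again with the matching command from its section.&lt;/p&gt;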




&lt;p&gt;I hope this post proves helpful to all data science enthusiasts.&lt;/p&gt;

&lt;p&gt;Any feedback would be much appreciated.&lt;/p&gt;

</description>
      <category>datascience</category>
      <category>machinelearning</category>
      <category>nlp</category>
      <category>beginners</category>
    </item>
  </channel>
</rss>
