Data Analytics

Candidates for this exam are seeking to prove introductory knowledge of how to responsibly manipulate, analyze, and communicate findings of data analysis.
Candidates should have at least 150 hours of instruction or hands-on experience with data manipulation, analysis, visualization, and communication. They should be familiar with general data concepts, data-related laws, and responsible analytics practices.

To be successful on the test, the candidate is also expected to have the following prerequisite knowledge and skills:

  • 8th grade reading skills
  • Critical thinking and problem-solving skills
  • Digital literacy skills, including the ability to research, create content, and solve problems using technology
  • Algebra I

1. Data Basics

    1.1 Define the concept of data
    1.2 Describe basic data variable types
    • Boolean, numeric, string
    1.3 Describe basic structures used in data analytics
    • Tables, rows, columns, lists
    1.4 Describe data categories
    • Qualitative, quantitative, structured, unstructured, metadata, big data
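As a rough illustration of objectives 1.2 and 1.3, the sketch below shows the three basic variable types (Boolean, numeric, string) stored in a small table built from a list of rows. The column names and values are made up for the example and are not part of the exam objectives.

```python
# A tiny table represented as a list of rows; each row maps column names to values.
# The names and values here are hypothetical sample data.
rows = [
    {"name": "Ada", "age": 36, "active": True},
    {"name": "Grace", "age": 45, "active": False},
]

# A column is the list of values that share one key across all rows.
ages = [row["age"] for row in rows]

# Map each column of the first row to the name of its Python type.
column_types = {column: type(value).__name__ for column, value in rows[0].items()}

print(ages)
print(column_types)
```

Here `str` holds string data, `int` holds numeric data, and `bool` holds Boolean data; other tools (SQL, R, Excel) use different names for the same ideas.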

2. Data Manipulation

    2.1 Import, store, and export data
    • Fundamental understanding of ETL (extract, transform, and load) processes, data manipulation tools (SQL, R, Python, Microsoft Excel including aspects of Power Query), and common data storage file formats (delimited data files, XML, JSON)
    2.2 Clean data
    • Purpose and common practices (handling NULL, special characters, trimming spaces, inconsistent formatting, removing duplicates, imputing data, etc.); validating data
    2.3 Organize data
    • Purpose and common practices (sorting, filtering, slicing, transposing, appending, truncating, etc.)
    2.4 Aggregate data
    • Purpose and common practices (grouping, joining/merging, summarizing, pivoting, etc.)
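The practices listed in this section can be sketched end to end as a very small ETL flow: read delimited data, clean it, aggregate it, and export JSON. This is a minimal sketch using only the Python standard library; the sample data and the `region` and `sales` column names are hypothetical, and dropping rows with missing values is just one option (imputing a value is another).

```python
import csv
import io
import json
from collections import defaultdict

# Hypothetical delimited data with stray spaces, inconsistent case,
# a missing value, and a duplicate row.
raw = """region,sales
 North ,100
South,
 North ,100
south,50
"""

# Extract: parse the delimited text into rows.
rows = list(csv.DictReader(io.StringIO(raw)))

# Transform: trim spaces, normalize case, skip missing values, remove duplicates.
cleaned = []
seen = set()
for row in rows:
    region = row["region"].strip().title()
    sales = row["sales"].strip()
    if not sales:  # treat an empty field as NULL and skip the row
        continue
    record = (region, int(sales))
    if record in seen:  # exact duplicate of an earlier cleaned row
        continue
    seen.add(record)
    cleaned.append(record)

# Aggregate: group by region and sum sales.
totals = defaultdict(int)
for region, sales in cleaned:
    totals[region] += sales

# Load: export the summary as JSON text.
output = json.dumps(dict(totals), sort_keys=True)
print(output)
```

In practice the same steps are often done with SQL, R, pandas, or Power Query; the standard library is used here only to keep the example self-contained.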

3. Data Analysis

    3.1 Describe and differentiate between types of data analysis
    • Descriptive analysis, diagnostic analysis, hypothesis testing, predictive analysis, prescriptive analysis
    3.2 Describe and differentiate between data aggregation and interpretation metrics
    • Searching, filtering, unique values, aggregate functions such as Sum, Max, Min, Count, Avg/Mean, Mode, Median, Std Dev
    3.3 Describe and differentiate between exploratory data analysis methods
    • Identify data relationships, describe data drilling concepts (granularity, etc.), describe data mining concepts (anomalies, correlation analysis, patterns, outliers, etc.)
    3.4 Evaluate and explain the results of data analyses
    • Calculate trends, determine expected values, interpret results of predictive models, p-values, t-tests, and regression analyses
    3.5 Define and describe the role of artificial intelligence in data analysis
    • Define artificial intelligence, machine learning, and algorithm; describe how AI is used in data analysis; describe how machine learning algorithms are used in data analysis (Note: Specific algorithms are out of scope)
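The aggregate functions named in objective 3.2 can be tried out with Python's built-in functions and the standard `statistics` module. The data values below are an arbitrary sample chosen for illustration, and the population standard deviation is used here; a sample standard deviation (`statistics.stdev`) would give a different result.

```python
import statistics

# Arbitrary sample values chosen for illustration.
values = [2, 4, 4, 4, 5, 5, 7, 9]

summary = {
    "sum": sum(values),
    "max": max(values),
    "min": min(values),
    "count": len(values),
    "mean": statistics.mean(values),
    "median": statistics.median(values),
    "mode": statistics.mode(values),
    "std_dev": statistics.pstdev(values),  # population standard deviation
}

# Unique values, sorted for readability.
unique_values = sorted(set(values))

for name, result in summary.items():
    print(name, result)
print(unique_values)
```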

4. Data Visualization and Communication

    4.1 Report data
    • Effectively display information in tables and charts; explain when and why to disaggregate data
    4.2 Create visualizations from data
    • Identify data visualization practices that minimize the potential for misinterpretation; identify visualization types that represent the underlying data structure and analysis questions (including comparison, time/trend, part-to-whole, relationship, distribution, correlation graphs, box and whisker diagram, scatter chart, scatter plot, bar chart, Sankey diagram, histogram, pie chart, column chart, etc.)
    4.3 Derive conclusions from a data visualization
    • Translate a visual representation of data into words; identify differences between claims based on an analysis and its graphical representation

5. Responsible Analytics Practices

    5.1 Describe data privacy laws and best practices
    • GDPR, FERPA, HIPAA, IRB, PCI, etc.
    5.2 Describe best practices for responsible data handling
    • Methods of handling PII, securing data, and protecting anonymity within small data sets; importance of anonymizing data; trade-offs when balancing interpretability and accuracy; shortcomings of making population-level generalizations with limited sample data
    5.3 Given a scenario, describe types of bias that affect collection and interpretation of data
    • Confirmation bias, human cognitive bias, motivational bias, sampling bias; selecting visualizations/data representations to avoid bias
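One common way to protect anonymity within small data sets (objective 5.2) is to suppress counts for groups that are too small to report safely. The sketch below assumes a minimum group size of 5, which is an illustrative threshold and not one set by the exam objectives or by any particular law; the department data is hypothetical.

```python
from collections import Counter

# Hypothetical records: one department label per survey respondent.
departments = ["Sales"] * 12 + ["Engineering"] * 8 + ["Legal"] * 2

MIN_GROUP_SIZE = 5  # assumed threshold; real policies set their own value

counts = Counter(departments)

# Report the exact count only for groups that meet the threshold;
# smaller groups are shown as "suppressed" instead of a number.
report = {
    dept: (n if n >= MIN_GROUP_SIZE else "suppressed")
    for dept, n in counts.items()
}
print(report)
```

Suppression trades some detail for privacy: the two respondents in the smallest group can no longer be singled out from the published counts.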