Overview
This unit focuses on the foundational concepts of data science. Digital data is growing rapidly, and data is the underlying driver of the knowledge economy. This unit will give you foundational knowledge and practical skills in data collection, representation, storage, retrieval, management, analysis, and visualisation through the exploration of data-related challenges. You will also learn how big data and business analytics affect business performance by supporting the development of useful information and knowledge for data-driven decision making.
Details
Pre-requisites or Co-requisites
Prerequisite: COIT11226 Systems Analysis
Co-requisite: COIT11237 Database Design & Implementation
Important note: Students enrolled in a subsequent unit who failed their pre-requisite unit should drop the subsequent unit before the census date or within 10 working days of Fail grade notification. Students who do not drop the unit in this timeframe cannot later drop the unit without academic and financial liability. See details in the Assessment Policy and Procedure (Higher Education Coursework).
Offerings For Term 2 - 2018
Attendance Requirements
All on-campus students are expected to attend scheduled classes – in some units, these classes are identified as a mandatory (pass/fail) component and attendance is compulsory. International students, on a student visa, must maintain a full time study load and meet both attendance and academic progress requirements in each study period (satisfactory attendance for International students is defined as maintaining at least an 80% attendance record).
Recommended Student Time Commitment
Each 6-credit Undergraduate unit at CQUniversity requires an overall time commitment of an average of 12.5 hours of study per week, making a total of 150 hours for the unit.
Class Timetable
Assessment Overview
Assessment Grading
This is a graded unit: your overall grade will be calculated from the marks or grades for each assessment task, based on the relative weightings shown in the table above. You must obtain an overall mark for the unit of at least 50%, or an overall grade of ‘pass’ in order to pass the unit. If any ‘pass/fail’ tasks are shown in the table above they must also be completed successfully (‘pass’ grade). You must also meet any minimum mark requirements specified for a particular assessment task, as detailed in the ‘assessment task’ section (note that in some instances, the minimum mark for a task may be greater than 50%). Consult the University’s Grades and Results Policy for more details of interim results and final grades.
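As an illustration only, the weighted calculation described above can be sketched in Python. The weightings (40%, 40%, 20%) are those listed for the three assessment tasks in this unit; the student results and the names `WEIGHTS`, `PASS_MARK` and `overall_mark` are hypothetical. The sketch does not model pass/fail tasks or any task-specific minimum marks.

```python
# Weightings of the three assessment tasks in this unit (40%, 40%, 20%).
WEIGHTS = {
    "Practical Assessment": 0.40,
    "Written Assessment": 0.40,
    "Presentation": 0.20,
}
PASS_MARK = 50.0  # minimum overall percentage required to pass the unit


def overall_mark(percent_scores):
    """Return the weighted overall percentage for the unit."""
    return sum(WEIGHTS[task] * percent_scores[task] for task in WEIGHTS)


# Hypothetical results, each expressed as a percentage of that task's marks.
example_scores = {
    "Practical Assessment": 75.0,
    "Written Assessment": 60.0,
    "Presentation": 80.0,
}

total = overall_mark(example_scores)
print(f"Overall mark: {total:.1f}%")
print("Meets the 50% overall requirement:", total >= PASS_MARK)
```

With these hypothetical results the overall mark is 0.40 × 75 + 0.40 × 60 + 0.20 × 80 = 70%.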
All University policies are available on the CQUniversity Policy site.
You may wish to view these policies:
- Grades and Results Policy
- Assessment Policy and Procedure (Higher Education Coursework)
- Review of Grade Procedure
- Student Academic Integrity Policy and Procedure
- Monitoring Academic Progress (MAP) Policy and Procedure – Domestic Students
- Monitoring Academic Progress (MAP) Policy and Procedure – International Students
- Student Refund and Credit Balance Policy and Procedure
- Student Feedback – Compliments and Complaints Policy and Procedure
- Information and Communications Technology Acceptable Use Policy and Procedure
This list is not an exhaustive list of all University policies. The full list of University policies is available on the CQUniversity Policy site.
- Discuss and demonstrate data science foundational concepts
- Investigate and evaluate applications for data storage, management, retrieval, and analysis and visualisation
- Apply knowledge to process data for data driven decision making
- Analyse and generate solutions to solve data-related challenges
- Demonstrate the knowledge required in using data science skills to solve business problems.
Australian Computer Society (ACS) recognises the Skills Framework for the Information Age (SFIA). SFIA is in use in over 100 countries and provides a widely used and consistent definition of ICT skills. SFIA is increasingly being used when developing job descriptions and role profiles.
ACS members can use the tool MySFIA to build a skills profile at https://www.acs.org.au/professionalrecognition/mysfia-b2c.html
This unit contributes to the following workplace skills as defined by SFIA. The SFIA code is included:
Data Management (DATM)
Business Analysis (BUAN)
Data Analysis (DTAN)
IT Operation (ITOP)
Alignment of Assessment Tasks to Learning Outcomes
Assessment Tasks / Learning Outcomes | 1 | 2 | 3 | 4 | 5
---|---|---|---|---|---
1 - Practical Assessment - 40% | • | • | | |
2 - Written Assessment - 40% | | • | • | • | •
3 - Presentation - 20% | | | • | • | •
Alignment of Graduate Attributes to Learning Outcomes
Graduate Attributes / Learning Outcomes | 1 | 2 | 3 | 4 | 5
---|---|---|---|---|---
1 - Communication | | | | |
2 - Problem Solving | | | | |
3 - Critical Thinking | | | | |
4 - Information Literacy | | | | |
5 - Team Work | | | | |
6 - Information Technology Competence | | | | |
7 - Cross Cultural Competence | | | | |
8 - Ethical practice | | | | |
9 - Social Innovation | | | | |
10 - Aboriginal and Torres Strait Islander Cultures | | | | |
Alignment of Assessment Tasks to Graduate Attributes
Assessment Tasks / Graduate Attributes | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10
---|---|---|---|---|---|---|---|---|---|---
1 - Practical Assessment - 40% | • | | • | • | | • | | | | |
2 - Written Assessment - 40% | • | • | • | • | | • | | • | • |
3 - Presentation - 20% | • | • | | • | | | | | | |
Textbooks
Data Science
Edition: 2013 (2013)
Authors: Jeffrey Stanton
Creative Commons Attribution-NonCommercial-ShareAlike 3.0 license
Binding: eBook
Data Science for Business: What You Need to Know about Data Mining and Data-Analytic Thinking
Edition: latest (2013)
Authors: Foster Provost and Tom Fawcett
O'Reilly Media
Binding: eBook
Practical Data Science with Hadoop and Spark: Designing and Building Effective Analytics at Scale
(2016)
Authors: Mendelevitch, O, Stella, C & Eadline, D
Pearson
Upper Saddle River, NJ, USA
ISBN: 9780134024141
Binding: Paperback
R Programming for Data Science
Edition: Latest (2015)
Authors: Roger D Peng
Leanpub
Binding: eBook
Additional Textbook Information
Introduction to Data Science: A PDF version of this book and the code examples used in the book are available at: http://jsresearch.net/wiki/projects/teachdatascie
IT Resources
- CQUniversity Student Email
- Internet
- Unit Website (Moodle)
- R statistical software, available as a free download online (download RStudio as well, which is also free). Compatible with Mac and PC.
- R Studio and R
All submissions for this unit must use the referencing style: Harvard (author-date)
For further information, see the Assessment Tasks.
m.jha@cqu.edu.au
Module/Topic | Chapter | Events and Submissions/Topic
---|---|---
Introduction to Data Science: What is data science; data domination; innovation from internet giants; data science history; data science in modern enterprises; soft skills of a data scientist; data science project life cycle; types of data; big data; how is big data different. | Chapter 1 of Practical Data Science with Hadoop and Spark: Designing and Building Effective Analytics at Scale (O Mendelevitch, C Stella and D Eadline); Chapter 1 of Data Science (J Stanton) |
Identifying Data Problems: From business problems to data mining tasks; data mining tasks; data collection; business use cases; sampling; data mining process. | Chapter 2 of Practical Data Science with Hadoop and Spark: Designing and Building Effective Analytics at Scale (O Mendelevitch, C Stella and D Eadline); Chapter 2 of Data Science (J Stanton) |
Hadoop and Data Science: Storage requirements; what is Hadoop; Hadoop's evolution; Hadoop tools for data science; Spark, R for data science; R package. | Chapter 3 of Practical Data Science with Hadoop and Spark: Designing and Building Effective Analytics at Scale (O Mendelevitch, C Stella and D Eadline); Chapters 3, 4 and 8 of Data Science (J Stanton) |
Data Presentation: Understand different ways of summarizing data; choose the right table/graph for the right data and audience; self-explanatory graphics; attractive graphs and tables. | CRO on Moodle; Chapters 5, 6 and 9 of Data Science (J Stanton) |
Data Dissemination and Use: The purpose of dissemination; dissemination issues and concerns; strengths and weaknesses of different communication formats; components of dissemination plans. | Chapter 7 of Data Science (J Stanton) and supplementary readings |
Revise all previous lecture slides and tutorial work | |
Data Retrieval: What is information retrieval; information retrieval (IR) vs question; ideal information retrieval; relevant answers; how is IR accomplished; information acquisition process; what is search; how IR systems work; search engines; structure of an IR system. | Chapter 10 of Data Science (J Stanton) and supplementary readings | Practical Assessment due: Week 6 Friday (24 Aug 2018) 11:45 pm AEST
Data Analytics: Why analytics; different types of analytics; delivery methods for the operational users; holistic approach to expand enterprise analytics; value of integration and data quality to analytics. | Chapters 11 and 12 of Data Science (J Stanton) and supplementary readings |
Data Discovery and Data Mining: Data driven decisions; enabling data driven innovations; knowledge discovery process; data cleaning; data integration; data selection; data transformation; knowledge based systems; data mining and its goals; data mining operation and process. | Chapters 13 and 14 of Data Science (J Stanton) and supplementary readings |
Semantic Analysis: What is semantic analysis; context sensitive analysis; approaches to semantic analysis; applications of semantic analysis; comparison with artificial intelligence; strategies for semantic analysis; symbol table and type checking. | Chapters 15 and 16 of Data Science (J Stanton) and supplementary readings |
Data Security and Privacy: Protection of personal data; data collection and significant risks; challenges of big data for data protection; confidentiality; integrity; availability; middleware security concerns; built in database protection; privacy issues; data security and storage; identification and authentication. | Chapter 17 of Data Science (J Stanton) and supplementary readings |
Data Integration: Analytic data integration; challenges in data integration; technologies in data integration; data mapping; data staging; data extraction; data transformation; data loading; need for integration; data integration approaches. | Chapter 18 of Data Science (J Stanton) and supplementary readings | Written Assessment due: Week 11 Friday (28 Sept 2018) 12:00 am AEST
Cloud Computing for Data Processing: What is cloud; what is cloud computing; deployment models of cloud; advantages of cloud; characteristics of cloud | CRO provided on Moodle and supplementary readings | Presentation due: Week 12 Monday (1 Oct 2018) 11:45 pm AEST
No Final Exam for this Unit | |
Contact information for Dr Meena Jha: Email: m.jha@cqu.edu.au Office: Level 2, 400 Kent Street, Sydney Campus; P +61 2 9324 5776 | X 55776. Please submit questions about the unit through the 'Q&A' discussion forum in Moodle - that way, everyone can benefit from the questions and answers. If you have any individual queries, please email me and I'll try to get back to you within a day or so.
1 Practical Assessment
This assessment is designed to reinforce the content taught in Weeks 1-5 and relates to learning outcomes 1 and 2. It is an individual assessment and should be submitted in Week 6. You will submit your work on data processing exercises in the R language, which gives you an opportunity to learn data storage and processing using R. Each week you will be presented with a data-related challenge, and will use computer tools to manipulate data to solve that challenge. This task will help to build your knowledge of data formats and of retrieval and analysis techniques. Details of the R language questions will be provided to you through Moodle in Week 1. This assessment contributes 40% of the total marks.
Week 6 Friday (24 Aug 2018) 11:45 pm AEST
All files must be submitted on Moodle.
Week 8 Friday (7 Sept 2018)
Markers will do their best to return feedback on Assessment 1 by the return date.
You will be given 20 questions to answer, and each question is worth 2 marks, giving a total of 40 marks. This assessment is worth 40% of the total marks for the unit.
Assessment 1 will be marked based on the following criteria:
Working R source code provided: 1.5 × 20 = 30 marks
Screenshots submitted for all 20 questions: 0.5 × 20 = 10 marks
Total: 40 marks
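The marking arithmetic above can be checked with a short Python sketch. It is illustrative only: the function name `assessment1_marks` is hypothetical, and the sketch assumes each component of a question is awarded either in full or not at all, which the criteria above do not state.

```python
QUESTIONS = 20
CODE_MARKS = 1.5        # per question, for working R source code
SCREENSHOT_MARKS = 0.5  # per question, for a submitted screenshot


def assessment1_marks(working_code, screenshots):
    """Return marks out of 40, given how many of the 20 questions
    have working code and how many have a screenshot.

    Assumption: each component is all-or-nothing per question."""
    for count in (working_code, screenshots):
        if not 0 <= count <= QUESTIONS:
            raise ValueError("counts must be between 0 and 20")
    return working_code * CODE_MARKS + screenshots * SCREENSHOT_MARKS


print(assessment1_marks(20, 20))  # full marks: 30 + 10
print(assessment1_marks(16, 18))  # a hypothetical partial submission
```

A complete submission gives 30 + 10 = 40 marks; the hypothetical partial submission gives 16 × 1.5 + 18 × 0.5 = 33 marks.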
- Discuss and demonstrate data science foundational concepts
- Investigate and evaluate applications for data storage, management, retrieval, and analysis and visualisation
- Communication
- Critical Thinking
- Information Literacy
- Information Technology Competence
2 Written Assessment
This assessment is based on a case study. You are required to write a report of 2000 words. This is an individual report and contributes to learning outcomes 2, 3, 4 and 5. The report will follow a standard business report format. You will investigate how you might advise the organisation described in the case study on data storage, retrieval, and analysis mechanisms. You will submit your assignment through Moodle. The assignment will be marked out of a total of 100 marks and forms 40% of the total assessment for the unit.
Week 11 Friday (28 Sept 2018) 12:00 am AEST
Exam Week Monday (15 Oct 2018)
Feedback on this assessment will be released after the certification date, as this unit does not have an exam.
Marking criteria for Assessment 2:
Report formatting (font, header and footer, table of contents, numbering, referencing): 5 marks
Professional communication (correct spelling, grammar, formal business language used): 5 marks
Executive summary: 10 marks
Report introduction: 10 marks
Data Collection and Storage: 20 marks
Data in Action: 30 marks
Business continuity: 10 marks
Conclusion and Recommendations: 10 marks
Total: 100 marks
- Investigate and evaluate applications for data storage, management, retrieval, and analysis and visualisation
- Apply knowledge to process data for data driven decision making
- Analyse and generate solutions to solve data-related challenges
- Demonstrate the knowledge required in using data science skills to solve business problems.
- Communication
- Problem Solving
- Critical Thinking
- Information Literacy
- Information Technology Competence
- Ethical practice
- Social Innovation
3 Presentation
This is an individual presentation, and all students are required to present. You are required to present the recommendations and outcomes of the report you developed for the case study in the written assessment. This assessment contributes to learning outcomes 3, 4 and 5.
For DISTANCE students only: Distance students will present via Zoom. The Unit Coordinator will conduct this presentation.
Week 12 Monday (1 Oct 2018) 11:45 pm AEST
Exam Week Monday (15 Oct 2018)
The feedback will be released after the certification date, as this unit does not have an exam.
Marking criteria for the presentation:
Stay on topic: 5 marks
Fulfil requirements of topic: 5 marks
Slide style: 5 marks
Presentation style: 5 marks
Total: 20 marks
- Apply knowledge to process data for data driven decision making
- Analyse and generate solutions to solve data-related challenges
- Demonstrate the knowledge required in using data science skills to solve business problems.
- Communication
- Problem Solving
- Information Literacy
As a CQUniversity student you are expected to act honestly in all aspects of your academic work.
Any assessable work undertaken or submitted for review or assessment must be your own work. Assessable work is any type of work you do to meet the assessment requirements in the unit, including draft work submitted for review and feedback and final work to be assessed.
When you use the ideas, words or data of others in your assessment, you must thoroughly and clearly acknowledge the source of this information by using the correct referencing style for your unit. Using others’ work without proper acknowledgement may be considered a form of intellectual dishonesty.
Participating honestly, respectfully, responsibly, and fairly in your university study ensures the CQUniversity qualification you earn will be valued as a true indication of your individual academic achievement and will continue to receive the respect and recognition it deserves.
As a student, you are responsible for reading and following CQUniversity’s policies, including the Student Academic Integrity Policy and Procedure. This policy sets out CQUniversity’s expectations of you to act with integrity, examples of academic integrity breaches to avoid, the processes used to address alleged breaches of academic integrity, and potential penalties.
What is a breach of academic integrity?
A breach of academic integrity includes but is not limited to plagiarism, self-plagiarism, collusion, cheating, contract cheating, and academic misconduct. The Student Academic Integrity Policy and Procedure defines what these terms mean and gives examples.
Why is academic integrity important?
A breach of academic integrity may result in one or more penalties, including suspension or even expulsion from the University. It can also have negative implications for student visas and future enrolment at CQUniversity or elsewhere. Students who engage in contract cheating also risk being blackmailed by contract cheating services.
Where can I get assistance?
For academic advice and guidance, the Academic Learning Centre (ALC) can support you in becoming confident in completing assessments with integrity and to a high standard.