Data Science

Overview

Data Science — MLIS Career Pathway

Data science is the art and science of collecting, organizing, processing, analyzing, archiving, preserving, and providing access to massive amounts of data in order to extract meaningful information.

As noted by LIS professional Amy Affelt in her book The Accidental Data Scientist: Big Data Applications and Opportunities for Librarians and Information Professionals (Information Today, 2015), librarians have always worked with reasonably large amounts of data (circulation desk metrics, budgeting and strategic planning, GIS demographic mapping, etc.), but most often in relational databases and similar familiar formats. The difference is that today’s volume of data generation is so massive (i.e., big data) that it can only be managed and made sense of through complex computer-based algorithms.

This is the data challenge now facing businesses, government agencies, educational institutions, healthcare organizations, and social media platforms, among others – how do they make sense of all these data points in order to make smarter decisions?

Employment Opportunities

To understand the range of data-related employment opportunities, it’s useful to first consider the data lifecycle, or all the points in gathering, managing, using, and storing data where a data specialist might be involved. For example, a typical data lifecycle might involve the following steps:

  • Collecting the raw data, which includes identifying the relevant/best data sources and vetting their credibility and appropriateness for the task or question at hand.
  • Processing the data, i.e., bringing them into one or more data management systems in a uniform format and structure.
  • Cleaning the data, which includes “scrubbing” the data of items such as duplications so reliable datasets can be created.
  • Creating and applying algorithms or data queries that will surface data relevant to the question or issue being considered (often called “data mining”).
  • Analyzing the datasets, or results of the models and algorithms that have been run, in order to identify meaningful, actionable patterns of information – in other words, what is this data telling us about the question we are asking? What projections about the future can we make based on these historical patterns?
  • Communicating the outcomes, often through some type of data visualization or a dashboard format, in a manner that lets key stakeholders easily understand and base decisions on the findings.
  • Preserving the data, creating and maintaining systems that provide for the preservation, access, retrieval, and potential re-use of the relevant data and datasets for future reference.
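As a rough illustration of the middle steps of this lifecycle, the short Python sketch below processes, cleans, queries, analyzes, and reports on a tiny dataset. The circulation records, field names, and function names are all hypothetical and were invented for this example; collection and preservation are left out, and real projects would typically rely on dedicated data management tools rather than a single script.

```python
import csv
import io
from collections import Counter

# Hypothetical raw circulation records; fields and values are invented for illustration.
RAW_CSV = """branch,item_type,checkouts
Main, Book ,12
main,book,12
East,DVD,5
East,Book,7
Main,DVD,3
"""


def process(raw_text):
    """Process: load the rows and normalize them into a uniform structure."""
    rows = []
    for row in csv.DictReader(io.StringIO(raw_text)):
        rows.append({
            "branch": row["branch"].strip().title(),
            "item_type": row["item_type"].strip().lower(),
            "checkouts": int(row["checkouts"]),
        })
    return rows


def clean(rows):
    """Clean: drop exact duplicates while keeping the original order."""
    seen = set()
    unique = []
    for row in rows:
        key = (row["branch"], row["item_type"], row["checkouts"])
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def query(rows, item_type):
    """Query: surface only the records relevant to the question."""
    return [row for row in rows if row["item_type"] == item_type]


def analyze(rows):
    """Analyze: total the checkouts for each branch."""
    totals = Counter()
    for row in rows:
        totals[row["branch"]] += row["checkouts"]
    return dict(totals)


records = clean(process(RAW_CSV))
book_totals = analyze(query(records, "book"))

# Communicate: print a small plain-text summary for stakeholders.
for branch, total in sorted(book_totals.items()):
    print(f"{branch}: {total} book checkouts")
```

Here the first two rows describe the same record with inconsistent spacing and capitalization, so normalizing them during processing is what allows the cleaning step to recognize and remove the duplicate.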

Students interested in data science and data librarianship might find work in any of the activities identified above, depending on where in the data lifecycle process they want to specialize. Based on that range of options, in addition to the very broadly defined “data librarian,” a representative list of data-focused jobs includes (among many others):

  • Big data engineer
  • Business intelligence analyst
  • Customer data analyst
  • Data acquisitions specialist
  • Data analytics manager
  • Data analyst/report writer
  • Data and information specialist
  • Data architect
  • Data archivist
  • Data asset manager
  • Data curator
  • Data metadata specialist
  • Data modeler
  • Data quality manager
  • Data services librarian
  • Data visualization specialist
  • Data warehouse manager
  • Database developer
  • Governance data quality steward
  • Research data librarian
  • Scientific data manager

MLIS Skills at Work Report

The MLIS Skills at Work report includes trends and data that can help you prepare for career advancement within the information professions. The slides listed below relate directly to the data science career path. In addition, slides #12, #13, and #14 highlight the skills most valuable to employers.

  • See the MLIS Skills at Work report, slides #5 through #8 for more detailed information about hiring trends and slide #21 for representative job titles
  • See slide #26 to view sample job titles, job duties, job skills, and technology/standards for data management and analysis
  • See also slides #25 (Collection, Acquisition and Circulation), #24 (Cataloging and Metadata), #32 (Reference and Research), and #31 (Outreach, Programming and Instruction) for additional roles within this career pathway

Core Theory and Knowledge

The core theory and knowledge of data science is structured around the key activities encompassed in the data lifecycle and the processes used to derive meaning from that data. In general, core knowledge areas include:

  • The goals and uses of data science for decision-making, predictive modeling, and similar business cases
  • The data lifecycle and each phase within that lifecycle
  • Data science systems and technologies and how to apply them in real-life settings
  • The processes by which disparate sources of data can be gathered, formatted for uniformity, and made searchable so as to provide meaningful information and insights
  • The primary tools of data management and manipulation, such as Hadoop, Splunk, Sumo Logic, and Spark, and how to choose the best solution for specific data challenges
  • Fundamental data science activities such as data mining, data analytics, data querying, and data visualization, and how to apply them to elicit and present actionable business or organizational insights
  • The diverse range of existing and emerging data sources such as social network interactions, personal “wearable” health/activity monitors, shopping patterns, and similar emerging applications

MLIS Requirements

The MLIS program requires 43 units for graduation. Within those units, six courses (16 units) are required of all MLIS students and must be taken as part of all career pathways: INFO 203, INFO 200, INFO 202, INFO 204, INFO 285, and either INFO 289 or INFO 299. Beyond those six courses, a student is free to select electives reflecting individual interests and aspirations.

If you are interested in this career pathway, you may choose to select from the foundation or recommended course electives listed below. Foundation courses provide the core knowledge and skills for this pathway. If you can only select a few electives, choose from the foundation courses. See also the recommended courses in the Areas of Emphasis section below.

The career pathway described here is provided solely for advising purposes. No special designation appears on your transcript or diploma. All graduating students receive an MLIS degree.

Recommended Coursework

Required Courses

Foundation Courses

Effective leadership and management (of people and information) are critically important for all types of work environments and clients. We recommend that students also consider selecting courses from the Leadership and Management career path to complement or supplement core skills in other areas.

Areas of Emphasis within the Data Science Pathway

While all students earn an MLIS degree from the iSchool (no special designation appears on academic transcripts or diplomas), students may include Area of Emphasis information about their skill sets on resumes and in cover letters. The iSchool faculty (with input from the Knowledge Organization Program Advisory Committee) developed the recommended courses below for these Areas of Emphasis.

Data & Records Management

This area of emphasis focuses on data analytics, data curation, data management, data preservation, data processing, data querying/mining, data solutions, data sources, discerning meaning from data, and tools and systems.

Data Analytics & Communication

This area of emphasis focuses on data analytics, data communication, data querying/mining, data solutions, data visualization, discerning meaning from data, representing meaning in data, and tools and systems.

Note: See also INFM 210 Health Informatics.

 

Faculty pathway advisors are available to help guide you and answer questions about planning a career in their area of expertise.

Learn More

For an excellent introduction to the field, check out Amy Affelt’s Accidental Data Scientist: Big Data Applications and Opportunities for Librarians and Information Professionals (Information Today, 2015), available through King Library as an ebook.

To connect with professionals engaged in data-related information work, consider joining the iSchool student chapters of ASIS&T and/or SLA and exploring the associations’ special interest groups, such as SLA’s Data Caucus.

You may also want to check out the following resources.
