HR Analytics – Big Data

What is HR Analytics?

‘HR analytics consists of a number of processes, enabled by technology, that use descriptive, visual and statistical methods to interpret people data and HR processes. These analytical processes are related to key ideas such as human capital, HR systems and processes, organisational performance, and also consider external benchmarking data.’ (Marler and Boudreau, 2017)

This is one of several definitions for a practice that goes by many names:

  • HR Analytics
  • People Analytics
  • Workforce Analytics

A simple explanation of HR analytics is that it quantifies how the workforce affects the business. For example, Lowe’s, a US home improvement retail chain, used HR analytics to establish a link between HR processes, employee engagement, and store performance.

The idea has been around since 2004, but there are few evidence-based reviews of the practice. Some have called it a fad; others see it as an opportunity yet to be fully realised.

The practice depends on ‘Big Data’. Big Data is, ideally, characterised by the ‘5 Vs’:

  1. Volume – a large amount of data
  2. Variety – covers many aspects
  3. Velocity – builds up fast
  4. Value – relevant data
  5. Veracity – accurate data

What data is needed?

Meaningful analytics are only possible if the appropriate data is collected accurately and on time.

Unfortunately, the quality of data that goes into many HR systems is poor, and there are many gaps.

What data is collected?

Most HR information systems (HRIS) collect high-level information on workers who are hired or considered for hire:

  • employment history
  • skills and competencies
  • formal educational qualifications
  • demographic information

Once a worker is employed by a firm, the typical data collected is:

  • hours worked
  • pay and incentive compensation
  • leave
  • performance outcomes – for some workers (for example sales made, hours billed to clients)
  • performance ratings (subjective)
  • competency ratings
  • subjective ratings of potential to advance
  • capability gaps/training needs
  • training courses completed
  • performance issues, grievances, disciplinary cases
  • disputes and resolutions
  • anonymous staff attitude data
  • exit interview data

HRIS tend to be very standardised and typically lack the level of detail required for good analysis. Much important information may be unstructured, that is, stored in documents and spreadsheets.

Data is also fragmented. HR data is usually held in separate pieces of software designed to carry out different HR processes (Parry, 2011), but for HR analytics it needs to be consolidated in a data warehouse.
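The consolidation step can be pictured with a minimal sketch. The three source systems, their field names and all the values below are hypothetical, and a real data warehouse would involve far more than a dictionary merge. The point is only that joining records on a shared employee ID also exposes the gaps in each person's data.

```python
from collections import defaultdict

# Hypothetical extracts from three separate HR systems, keyed by employee ID.
payroll = [{"employee_id": 1, "hours_worked": 160, "pay": 5200},
           {"employee_id": 2, "hours_worked": 152, "pay": 4800}]
training = [{"employee_id": 1, "courses_completed": 3}]
performance = [{"employee_id": 2, "rating": 4},
               {"employee_id": 3, "rating": 2}]


def consolidate(*sources):
    """Merge records from several sources into one row per employee."""
    merged = defaultdict(dict)
    for source in sources:
        for record in source:
            merged[record["employee_id"]].update(record)
    return dict(merged)


def missing_fields(row, required):
    """List the required fields that no source system supplied for this row."""
    return sorted(field for field in required if field not in row)


warehouse = consolidate(payroll, training, performance)
required = ["hours_worked", "pay", "courses_completed", "rating"]
gaps = {emp_id: missing_fields(row, required) for emp_id, row in warehouse.items()}
```

Here employee 3 appears only in the performance system, so `gaps[3]` lists `courses_completed`, `hours_worked` and `pay` as missing. That is the kind of omission that undermines later analysis.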

What data should be collected?

Data needs to relate to things that are important for the operations of the organization, and it must cover factors that the organization can do something about. Examples include current and future staff competency needs, and the factors that affect staff engagement, retention and productivity.

It must be unambiguous, free of omissions, and not conflated with extraneous information.

For example, job knowledge is the single best predictor of successful performance. Organizations should therefore have a database of required versus actual competencies: job- and task-specific technical competencies, as well as core and leadership competencies.
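A required-versus-actual comparison of this kind might look like the sketch below. The competency names, the 0 to 5 proficiency scale and the rule that an unassessed competency counts as level 0 are all assumptions made for illustration.

```python
# Hypothetical required proficiency levels (0-5 scale) for one role.
required = {"forklift operation": 3, "stock control": 4, "team leadership": 2}

# Hypothetical assessed levels for one employee in that role.
actual = {"forklift operation": 3, "stock control": 2}


def competency_gaps(required, actual):
    """Return {competency: shortfall} for each competency below the required level."""
    gaps = {}
    for name, needed in required.items():
        shortfall = needed - actual.get(name, 0)  # assumption: unassessed counts as 0
        if shortfall > 0:
            gaps[name] = shortfall
    return gaps


gaps = competency_gaps(required, actual)
```

For this employee the result flags `stock control` and `team leadership`, each two levels short. Those are the entries that would feed a capability gap or training needs record.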

There should be objective data on job performance. It IS possible to measure job output for individuals, but incredibly this is not something that most organizations do.

With a shortage of skilled workers, staff retention is a priority. Organizations need to collect quality information on career interests and plans. Information on the work environment and how it is perceived by staff is needed to identify and fix issues that adversely affect staff engagement.

Staff attitude surveys could provide critical data. However, they are often conducted by third parties and are usually anonymous, so responses cannot be linked to other data about the individual. Identified attitude surveys would fill this gap, but few organizations are using them.

Who should be responsible for it?

There are two key aspects:

  1. Design and deployment of systems to collect the required data
  2. Analysis of the data.

Data analysis itself varies from simple calculations to advanced multivariate models. Many HR professionals are not currently confident performing higher-end analysis, nor do they have in-depth knowledge of IT systems and their potential.

For HR data to be useful, it must be able to predict key aspects of human performance and behaviour, for example the characteristics of staff likely to be high turnover risks.

“Predictive Analytics is nothing else, but assuming that the same thing will happen in the future, that happened in the past.”

Istvan Nagy-Racz

So in this case the historical data is staff data: demographics, performance data, career interest, development and competency data and, if available, attitude surveys indicating engagement levels and intentions to leave, and exit interviews.

Then there has to be a theory to test (the factors likely to predict turnover) and a method of calculation (the predictive model).

Since this is just a theory, it needs to be tested using subsets of the historical data. The model is ‘trained’, that is, tweaked until it fits the historical facts about the staff who left as closely as possible. That model can then be used to predict which existing staff are likely to be turnover risks in the future.

Of course this all rests on the assumption that the future will follow the same pattern as the past – and this may not always be the case.
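The train-then-test idea can be shown with a deliberately simple sketch. Everything in it is an assumption made for illustration: the staff records are synthetic, the only predictor is a made-up engagement score, and the ‘model’ is a single cut-off rather than the multivariate models a real project would use.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is repeatable


def make_employee():
    """Synthetic record: low engagement makes leaving more likely (an assumption)."""
    engagement = rng.uniform(1, 10)
    left = rng.random() < (0.8 if engagement < 4 else 0.1)
    return {"engagement": engagement, "left": left}


history = [make_employee() for _ in range(300)]
train, test = history[:200], history[200:]  # hold back a subset for testing


def predict(employee, threshold):
    """The 'theory': staff below the engagement threshold are turnover risks."""
    return employee["engagement"] < threshold


def accuracy(data, threshold):
    """Share of records where the prediction matches what actually happened."""
    hits = sum(predict(e, threshold) == e["left"] for e in data)
    return hits / len(data)


# 'Training': choose the threshold that best fits the historical leavers.
candidates = [t / 2 for t in range(2, 21)]  # 1.0, 1.5, ..., 10.0
best_threshold = max(candidates, key=lambda t: accuracy(train, t))

train_accuracy = accuracy(train, best_threshold)
test_accuracy = accuracy(test, best_threshold)  # checked on data the model has not seen

# Apply the trained rule to (hypothetical) current staff.
current_staff = [{"name": "A", "engagement": 2.5}, {"name": "B", "engagement": 8.0}]
at_risk = [s["name"] for s in current_staff if predict(s, best_threshold)]
```

Comparing `train_accuracy` with `test_accuracy` is the important step. A model that fits the training records well but does poorly on the held-back records has learned the quirks of the past data, not a pattern that is likely to repeat.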

In any event, IT and finance people are more likely than HR professionals to be competent in both systems and analytics.

The downside:

“in the absence of analytical skills HR may cede people analytics to the IT and finance functions, where relevant data management and analytics capabilities tend to exist. In doing this, HR loses the opportunity to influence strategically on workforce issues, and is also removed from discussions relating to work and the workforce”

Marler and Boudreau 2017

Opportunities and limitations of HR Analytics


Data is already collected by HR, but it isn’t until it is analysed that it can become a meaningful tool to help solve, predict or explain business problems.

Business performance and productivity along with workforce risk management are key interrelated areas where HR analytics are potentially useful.

There is a positive relationship between a strong HR analytics culture and business performance. The opportunity is that a deeper understanding of its people allows an organisation to take a more proactive approach to business performance management.

Workforce risk management using HR analytics can quantify issues such as talent retention, health and safety, and employee relations.

Ideally the value of HR data is in using it to answer strategic questions about how people create value for the organisation, so that value can be sustained and developed.


The analytics modules of HRIS software packages typically lack the capacity to perform this sort of analysis, which requires advanced statistical modelling.

Data sets often lack the relevant data, or the data they hold is inaccurate or incomplete.

Running analysis on the dirty data that exists is likely to produce correlations, but there is a good chance that many of these will be either spurious or of such small effect size as to be irrelevant.
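How easily spurious correlations arise can be demonstrated with pure noise. In the sketch below every number is random, so no measure has any real relationship with the outcome. The sample size of 30 staff, the 200 measures and the cut-off of roughly 0.36 (the approximate 5% significance threshold for a correlation with 30 observations) are choices made for illustration.

```python
import random
from math import sqrt

rng = random.Random(0)  # fixed seed so the sketch is repeatable


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


n_staff, n_measures = 30, 200

# An 'outcome' and 200 'HR measures', all pure random noise.
outcome = [rng.gauss(0, 1) for _ in range(n_staff)]
measures = [[rng.gauss(0, 1) for _ in range(n_staff)] for _ in range(n_measures)]

correlations = [pearson(m, outcome) for m in measures]

# With 30 observations, |r| above roughly 0.36 counts as 'significant' at the 5% level,
# so around one noise measure in twenty is expected to clear the bar by chance alone.
flagged = [r for r in correlations if abs(r) > 0.36]

# Effect size: r squared is the share of variance in the outcome 'explained'.
variance_explained = [r ** 2 for r in flagged]
```

Any entry in `flagged` is a correlation that would look meaningful in a report even though it was produced by chance. Checking `variance_explained` alongside significance is one way to judge whether a correlation is large enough to matter.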

To establish a cause and effect relationship requires ‘repeated observations’: systematically recording multiple instances of the same data, along with parallel recording of other data that may have an influence.

‘Experimental designs’, a structured approach to implementing new initiatives, need to be used to establish cause and effect relationships. A ‘multiple baseline’ design means recording data for a whole group of variables over a period of time to establish a baseline, then changing just one aspect to see whether or not that makes a difference.

For example, to analyse whether hours of work affect absenteeism, an organisation would collect data on its workforce demographics, hours of work and absenteeism across all locations. While these measures continue, hours of work are changed in just one location for a period and then reverted to the original. The change is then made in another location. By continuing the measures in all locations, the effect of other common factors such as season can be excluded. By using more than one location, factors that are location-specific can be excluded. If there is a cause and effect relationship, we would expect to see a change only when the hours are changed, and the effect to disappear when hours revert to normal.
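The logic of this design can be sketched with made-up numbers. Below, three hypothetical locations are tracked for 20 weeks, hours are changed at location A in weeks 5 to 8 and at location B in weeks 13 to 16, and location C is never changed. The absence rates are generated by a formula, not taken from real data, and comparing each site with the unchanged site C is one simple way to do the analysis, not the only one.

```python
from statistics import mean

WEEKS = range(1, 21)
# Weeks in which hours of work were changed; C is never changed.
CHANGE_WEEKS = {"A": range(5, 9), "B": range(13, 17), "C": range(0)}


def absence_rate(location, week):
    """Made-up absence rate (%): site level + seasonal bump + effect of changed hours."""
    site_level = {"A": 5.0, "B": 6.0, "C": 4.0}[location]
    seasonal = 1.0 if 9 <= week <= 16 else 0.0  # affects every location alike
    treatment = -1.5 if week in CHANGE_WEEKS[location] else 0.0
    return site_level + seasonal + treatment


def estimated_effect(location, comparison="C"):
    """Change in the gap to the comparison site while hours were changed."""
    gaps_changed, gaps_normal = [], []
    for week in WEEKS:
        # Subtracting the comparison site removes factors common to all sites.
        gap = absence_rate(location, week) - absence_rate(comparison, week)
        if week in CHANGE_WEEKS[location]:
            gaps_changed.append(gap)
        else:
            gaps_normal.append(gap)
    # Subtracting the site's own normal gap removes location-specific factors.
    return mean(gaps_changed) - mean(gaps_normal)


effects = {loc: estimated_effect(loc) for loc in ("A", "B")}
```

Both locations come out with an estimated effect of -1.5 percentage points, the value built into the made-up data, even though B's change falls inside the seasonal bump and A's does not. Seeing the same shift at two sites at two different times, and only while the hours are changed, is what supports a cause and effect reading.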

HR systems can easily be configured to collect this type of data, although HR Analytics systems are not typically set up for this kind of analysis.


Is HR Analytics just a fad – what of the ROI?

It is evident that there is a great deal of effort needed to collect the right data and then apply the appropriate analytical methods.

Sparrow et al. (2015) cite the example of how Tesco applied analytics tools to understand its customers and then to better understand its workforce. They describe how McDonald’s was able to identify how staff demographics, management behaviours and employee attitudes interacted to optimise restaurant performance. However, in the case of one retailer that established a link between employee engagement and store profitability, this required a three-year investment.

Even large multinational organisations that have made significant investments in HR analytics, and considerable progress in embedding analytics in other areas of their businesses,  report that their HR analytics programmes have not progressed beyond the reporting of basic historical HR information.

In a small 2016 survey of organisations ranging from small firms to multinationals, Excel was found to remain the primary tool for HR Analytics. Data quality, variety and quantity were major barriers to implementing HR Analytics. The survey report therefore concluded that HR Analytics is an initiative with “Big Promise – Small Reality”.

The question remains unanswered as to whether the significant investment in time and resources required to set up a quality HR Analytics programme will provide a positive financial return.


Marler, J.H. and Boudreau, J.W. (2017). ‘An evidence-based review of HR analytics’. The International Journal of Human Resource Management, 28: 1, 3–26.

CIPD and Workday (2018). People analytics: driving business performance with people data. Global research, June 2018.

Charlwood et al. (2016). ‘HR and analytics: why HR is set to fail the big data challenge’. Human Resource Management Journal.

Parry, E. (2011). ‘An examination of e-HRM as a means to increase the value of the HR function’. The International Journal of Human Resource Management, 22: 5, 1146–1162.

Sparrow, P., Hird, M. and Cooper, C. (2015). Do We Need HR? Repositioning People Management for Success. Basingstoke: Palgrave Macmillan.