Analyzing Employee Satisfaction in Major Consulting Firms from Glassdoor Reviews — Part 2 (Calculating Lift Scores)

Hyeon Gu Kim
6 min read · Jan 30, 2022

Team Members: Lucy Hwang, Rhiannon Pytlak, Hyeon Gu Kim, Mario Gonzalez, Namit Agrawal, Sophia Scott, Sungho Park

In the last blog, we scraped reviews from Glassdoor using Selenium, preprocessed and tokenized the text, lemmatized the words, and created word clouds using the NLTK and WordCloud libraries. Here is the list of steps we took for this project; we are now on the fourth step, where we calculate lift scores!

  1. Scrape consulting firms reviews from Glassdoor using Selenium
  2. Preprocess and lemmatize data
  3. Create word clouds
  4. Calculate lift scores (We’re here)
  5. Topic Modeling & Latent Dirichlet Allocation (LDA)
  6. Cosine Similarity
  7. Sentiment Analysis

What is a lift score? A lift score represents the strength of association between two words or phrases: it answers whether terms appear together in a message by chance or because of a real association. For example, if A = Volvo and B = safety, the lift score is as follows:

Lift(A, B) = P(A, B) / (P(A) × P(B))

If the lift score is equal to one, the two words co-occur no more often than expected by chance. If the lift score is greater than one, there is “lift,” i.e., a positive association between the two words. If the lift score is less than one, the two words are negatively associated: when you see one of them, you are less likely than chance to see the other.

The higher the lift score, the stronger the association between A and B.

The lift score formula can also be written like this:

Lift(A, B) = (#(A, B) / N) / ((#(A) / N) × (#(B) / N)) = N × #(A, B) / (#(A) × #(B))

Here “#(A)” means the number of messages that contain A, and “N” is the data size (the total number of messages).
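The count-based formula above can be sketched directly in a few lines of Python. The mini-corpus below is a made-up illustration, not our actual data:

```python
# Hypothetical mini-corpus of tokenized "messages", each stored as a set of words.
messages = [
    {"volvo", "safety", "great"},
    {"volvo", "price"},
    {"safety", "airbag"},
    {"volvo", "safety"},
    {"bmw", "speed"},
]

def lift(a, b, msgs):
    """Lift(A, B) = (#(A,B)/N) / ((#(A)/N) * (#(B)/N))."""
    n = len(msgs)
    n_a = sum(a in m for m in msgs)            # #(A): messages containing A
    n_b = sum(b in m for m in msgs)            # #(B): messages containing B
    n_ab = sum(a in m and b in m for m in msgs)  # #(A,B): messages containing both
    if n_a == 0 or n_b == 0:
        return 0.0
    return (n_ab / n) / ((n_a / n) * (n_b / n))

# "volvo" and "safety" co-occur in 2 of 5 messages, each appears in 3,
# so lift = (2/5) / ((3/5) * (3/5)) = 10/9 > 1: a positive association.
print(lift("volvo", "safety", messages))
```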

In addition, it is worth noting that lift is different from confidence. Confidence is the probability that people who mentioned word A also mentioned word B. For example, let A = “BMW” and B = “Lexus”. Then Confidence(Lexus | BMW) = #(Lexus, BMW) / #(BMW). Confidence has some drawbacks. It is not symmetric: #(Lexus, BMW) / #(BMW) and #(Lexus, BMW) / #(Lexus) are not the same. And if most people talk about Lexus anyway, Confidence(Lexus | BMW) will not be a useful metric. Compared to confidence, lift is a more reliable measure: Lift(A, B) = Confidence(B | A) / Support(B), where Support(B) simply means #(B) / N.

However, lift also has a drawback: it cannot see the context of a sentence. Sarcasm or negation, as in “The Green Lantern movie was so good!” or “I do not feel safe in a Volvo,” would still produce a high lift score between “Green Lantern movie” and “good”, and between “safe” and “Volvo”, even though we all know neither is a positive statement. To interpret the full context of a sentence, we need to apply sentiment analysis, which we will look at in step 7.

Now that we understand lift scores, let’s see how we used them to analyze employee satisfaction at these companies!

Just in case you forgot the last step, here is the dataframe that resulted from it (scraping, preprocessing, and lemmatization):

Main dataframe

Notice that “pros_replace” and “cons_replace” are the columns we added in the last step. We mainly used these two columns to calculate lift scores.

First, we manually chose some words from the most frequent pro and con words (image below). We also used the five main attributes of a company — work life balance, culture value, career opportunity, company benefit, and senior management. From here on, the pros & cons words are denoted as A and the five company attributes as B.

Pros attributes and Cons attributes

Now that we have A, B, and the data (the pros_replaced & cons_replaced columns), we are ready to compute lift scores. Using the second lift formula above, we iterate over every combination of pros & cons words (A) and the five company attributes (B) and calculate the lift score for each pair. For example, say A is “great” and B is “culture value”. The “get_lift” function iterates through each tokenized word list (the elements of the pros_replaced & cons_replaced columns) and checks whether the list contains A or B. If A appears in a word list, the function collects that list; the number of collected lists gives #(A). The same procedure yields #(B) and #(A, B). The function then simply plugs these counts into the formula.

Because this is a nested for loop, the total number of pair combinations was 5 × 30 (the five company attributes times the thirty pros words), i.e., 150 iterations. Each iteration also loops over every message — about 2,000 messages (4,000 including cons). This is why we chose a comparably small set of pros & cons candidate words: we were concerned about computation time. We calculated the lift scores with the code below:

Code snippet for computing lift scores
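Since the snippet above is an image, here is a minimal, self-contained sketch of the same idea. The function name “get_lift” comes from the post, but the attribute word sets and the sample tokenized reviews below are illustrative assumptions, not our actual data or exact code:

```python
import pandas as pd

# Each company attribute (B) is represented by a small set of indicator words —
# these particular sets are made up for the sketch.
company_attributes = {
    "work life balance": {"balance", "flexible", "hours"},
    "culture value": {"culture", "value", "people"},
}
pros_attributes = ["great", "good", "smart"]  # candidate A words

# Stand-ins for tokenized reviews from the pros_replaced column.
pros_replaced = [
    ["great", "culture", "people"],
    ["good", "hours", "flexible"],
    ["great", "balance"],
    ["smart", "people", "culture"],
]

def get_lift(word_a, words_b, token_lists):
    """Lift between word_a and an attribute represented by the set words_b."""
    n = len(token_lists)
    with_a = [t for t in token_lists if word_a in t]
    with_b = [t for t in token_lists if any(w in t for w in words_b)]
    with_ab = [t for t in with_a if any(w in t for w in words_b)]
    if not with_a or not with_b:
        return 0.0
    return (len(with_ab) / n) / ((len(with_a) / n) * (len(with_b) / n))

# Nested loop over every (A, B) combination, collected into a dataframe
# with pros words as rows and company attributes as columns.
rows = {a: {b: get_lift(a, words, pros_replaced)
            for b, words in company_attributes.items()}
        for a in pros_attributes}
lift_df = pd.DataFrame(rows).T
print(lift_df)
```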

We then created a dataframe of the resulting lift scores. Below are the first 9 rows of the dataframe:

Dataframe of the resulting lift scores

We can observe that the word “great” is highly associated with culture value, career opportunity, company benefits, and senior management. We can also see that company benefit appears with the words “happy” and “decent” very frequently, since both pairs have high lift scores (2.48 for (happy, company benefit) and for (decent, company benefit)). From this, we can assume that many Glassdoor reviewers (i.e., employees) thought positively about the company benefits of PwC (the company we are using as the example in this blog). On the other hand, the word “ethical” is not associated with company benefit (a lift score of zero), which makes sense because those two usually don’t go hand in hand.

Furthermore, we applied the same procedure to each company’s core values. In PwC’s case, its core values (“trust”, “solve”, “integrity”, “innovation”, “possibilities”, “improve”, “difference”, “care”, “together”) are treated as A and the five company attributes as B. Below is the resulting dataframe of lift scores:

Lift scores between the company’s core values and the five company attributes

The word “trust” appeared very frequently when work life balance was mentioned (lift score 5.37), as did the word “together” (lift score 7.16). However, there are many zeros in the resulting dataframe. Why? There could be multiple reasons.

First, each of the five company attributes should be defined by more words. There are many more words that can represent work life balance, for instance, and we cannot include every single one. Even if a message contains words that represent work life balance, its lift score will be zero if those words are not in the dictionary of words we chose.

Second, a company’s core value word may have been mistakenly altered or filtered out during the preprocessing/lemmatization stage. If we removed or lemmatized away one of a company’s core value words, how would we be able to compute a lift score for it?

Lastly, it could simply be natural: people just don’t use those words together. For instance, people normally don’t use the word “possibilities” when mentioning work life balance, culture value, or company benefits. The lift scores for “possibilities” reflect exactly this phenomenon (its high association with career opportunity and senior management makes sense).

In the next part, we will focus on topic modeling and LDA!

Click here for Part 3 (Topic Modeling & LDA) of the project!
