The looming AI bias in hiring and staffing decision-making

Remember when people worked at jobs for most of their lives? It was general practice back in the 1950s and 1960s. My dad worked for the same employer for 30 or so years. I recall his concern when I changed jobs after two years out of grad school, warning me that it wouldn’t bode well for my future prospects.

So here I am, ironically now 30-plus years into running my own business. Those long tenures are largely gone, and today's high-frequency job hopping has swelled the number of resumes that flood any hiring manager, which in turn has motivated many vendors to build automated screening tools. Companies in this space include HireVue, APTMetrics, Curious Thing, Gloat, Visier, Eightfold and Pymetrics, names you may never have heard of.

Add two things to this trend. First is the rise in quiet quitting, or employees who put in just the minimum effort at their jobs. The concept is old, but the increase is significant. Second, and the bigger problem, is another irony: we now have a very active HR market segment that is fueled by AI-based algorithms. The combination is both frustrating and toxic, as I learned from reading a new book entitled The Algorithm: How AI Decides Who Gets Hired, Monitored, Promoted, and Fired and Why We Need to Fight Back Now, by Hilke Schellmann, a journalism professor at NYU. It should be on your reading list. The book examines the trouble with using AI to make hiring and other staffing decisions. Schellmann takes a deep dive into the four core technologies now being deployed by HR departments around the world to screen and recommend potential job candidates, along with other AI-based tools used to evaluate employees' performance and inform judgments about raises, promotions, or firings. It is a fascinating look at this industry, and a scary one too.

Thanks to digital tools such as LinkedIn, Glassdoor and the like, sending in your resume to apply for an opening has never been easier. Just a few clicks and your resume is sent electronically to a hiring manager. Or so you thought. Nowadays, AI is used to automate the process: automated resume screeners, automated social media content analyzers, gamified qualification assessments, and one-way video recordings that are analyzed by facial and tone-of-voice AIs. All of these tools have issues: they aren't completely understood by employers or prospects, they rest on spurious assumptions, and they can't always quantify the qualities that would actually make a recruit successful at a future job.

What drew me into this book was that Schellmann does plenty of hands-on testing of the various AI services, using herself as a potential job seeker or staffer. For example, in one video interview, she replies to her set questions in German rather than English, and receives a high score from the AI.

She covers all sorts of tools, not just ones used to evaluate new hires, but others that fit into the entire HR lifecycle. And the “human” part of HR is becoming less evident as the bots take over. By take over, I don’t mean the Skynet path, but relying on automated solutions does present problems.

She raises this question: “Why are we automating a badly functioning system? In human hiring, almost 50 percent of new employees fail within the first year and a half. If humans have not figured out how to make good hires, why do we think automating this process will magically fix it?” She adds, “An AI skills-matching tool that is based on analyzing résumés won’t understand whether someone is really good at their job.” What about tools that flag teams with high turnover? The cause could be one of two polar opposites: a toxic manager, or a tremendous manager who is good at developing talent and encouraging people to leave for greener pastures.

Having run my own freelance writing and speaking business for more than 35 years, I have a somewhat different view of the hiring decision than many people. You could say that I have only rarely faced being hired for full-time employment, or that I face that decision multiple times a year, whenever I get an inquiry from a new client or from a previous client now working for a new company. Some editors I have worked with for decades as they have moved from pub to pub, for example. They hire me because they are familiar with my work and value the perspective and analysis that I bring to the party. No AI is going to figure that out anytime soon.

One of the tools that I came across in the before-AI times is the DISC assessment, a psychological profiling tool in the same vein as the Myers-Briggs that has been around for decades. I wrote about my test when I was attending a conference at Ford Motor Co. back in 2013. They were demonstrating how they use this tool to figure out the type of person most likely to buy any particular car model. Back in 2000, I wrote a somewhat tongue-in-cheek piece about how you can use the Myers-Briggs to match up your personality with that of your computing infrastructure.

But deciding if someone is an introvert or an extrovert is a well-trod path, with plenty of testing experience built up over the decades. These AI-powered tools don't have much of this history, and are based on shaky data sets riddled with assumptions. For example, HireVue's facial analysis algorithm is trained on video interviews with people already employed by the company. That sounds like a good first step, but having done one of those one-sided video interviews (where you are just talking to the camera, not interacting with an actual human asking the questions), I can tell you that you get no feedback from your interviewer, none of the subtle facial or audio cues that are part of normal human discourse. Eventually, in 2021, the company stopped using both tone-of-voice and facial-based algorithms entirely, claiming that natural language processing had surpassed both of them.

Another example is capturing how often you use first-person pronouns during the interview: I vs. we, for example. Is this a proxy for what kind of team player you might be? HireVue says they base their analysis on thousands of questions such as this, which doesn't make me feel any better about their algorithms. Just because a model has multiple parameters doesn't necessarily make it better or more useful.
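To make concrete the kind of signal such a tool might extract, here is a toy sketch (my own illustration, not HireVue's actual method) that tallies first-person singular versus plural pronouns in an interview transcript. Whether this ratio predicts anything about teamwork is exactly the question the book raises.

```python
import re

def pronoun_ratio(transcript):
    """Tally first-person singular vs. plural pronouns in a transcript."""
    words = re.findall(r"[a-z']+", transcript.lower())
    singular = sum(w in {"i", "me", "my", "mine"} for w in words)
    plural = sum(w in {"we", "us", "our", "ours"} for w in words)
    total = singular + plural
    return {
        "singular": singular,
        "plural": plural,
        # Share of "we"-style pronouns; 0.0 if no pronouns found at all.
        "we_share": plural / total if total else 0.0,
    }
```

Counting a handful of words is trivial; the leap from a count like this to a hiring score is where the trouble starts.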

Then there is the whole dust-up over overcoming built-in AI bias, something that has been written about for years, going back to when Amazon first unleashed its AI hiring tool and found it selected white men more often. I am not going there in this post, but her treatment runs deep and shows the limitations of using AI, no matter how many variables the vendors try to correlate in their models. What is important, something Mark Cuban touches on frequently in his posts, is that diverse groups of people produce better business results. And that diversity can be defined in various ways, not just by race and gender, but also by mental and physical disabilities. The AI modelers have to figure out, as all modelers do, the connection between playing a game or making a video recording and actual job performance. You need large and diverse training samples to pull this off, and even then you have to be careful about your own biases in constructing the models. She quotes one source who says, “Technology, in many cases, has enabled the removal of direct accountability, putting distance between human decision-makers and the outcomes of these hiring processes and other HR processes.”

Another dimension of the AI personnel assessment problem is the tremendous lack of transparency. Job prospects don't know what the AI-fueled tests entail, how they were scored, or whether they were rejected because of a faulty algorithm, bad training data, or some other computational oddity.

When you step back and consider the sheer quantity of data that can be collected by an employer (keystrokes on your desktop, website cookies that record the timestamps of your visits, emails, Slack and Teams message traffic, even Fitbit tracking stats), it is very depressing. Do these captured signals reveal anything about your working habits, your job performance, or anything, really? HR folks are relying more and more on AI assistance, and can now monitor just about every digital move an employee makes in the workplace, even when that workplace is the dining room table and the computer is shared by the employee's family. (There are several chapters on this subject in her book.)

This book will make you think about the intersection of AI and HR, and while there is a great deal of innovation happening, there is still much work to be done. As she says, context often gets lost. Her book will provide plenty of context for you to think about.

Book review: Micah Lee's Hacks, Leaks and Revelations

There has been a lot written about data leaks and the information contained therein, but few books tell you how to analyze leaked data yourself. That is the subject of the recently published Hacks, Leaks and Revelations.

This is a unique, interesting and informative book, written by Micah Lee, the director of information security for The Intercept, who has written numerous stories about leaked data over the years, including a dozen articles on some of the contents of the Snowden NSA files. What is unique is that Lee teaches you the skills and techniques that he used to investigate these datasets, and readers can follow along and do their own analysis with this data and other collections, such as emails from the far-right group Oath Keepers, materials leaked from the Heritage Foundation, and chat logs from the Russian ransomware group Conti. This is a book for budding data journalists, as well as for infosec specialists who are trying to harden their data infrastructure and prevent future leaks from happening.

Many of these databases can be found on DDoSecrets, the organization that arose from the ashes of WikiLeaks and where Lee is an adviser.

Lee’s book is also unique in that he starts off his journey with ways that readers can protect their own privacy, and that of potential data sources, as well as ways to verify that the data is authentic, something that even many experienced journalists might want to brush up on. “Because so much of source protection is beyond your control, it’s important to focus on the handful of things that aren’t.” This includes deleting records of interviews, any cloud-based data or local browsing history for example. “You don’t want to end up being a pawn in someone else’s information warfare,” he cautions. He spends time explaining what not to publish or how to redact the data, using his own experience with some very sensitive sources.

One of the interesting facts that I never spent much time thinking about before reading Lee's book is that while it is illegal to break into a website and steal data, it is perfectly legal for anyone to make a copy of that data once it has been made public, and to do their own investigation of it.

Another reason to read Lee's book is that there is so much practical how-to information, explained in simple step-by-step terms that even computer neophytes can quickly put to use. Each chapter has a series of exercises, split out by operating system, with directions. A good part of the book dives into the command line interfaces of Windows, Mac and Linux, and how to harness the power of these built-in tools.

Along the way you'll learn Python scripting to automate the various analytical tasks, and you'll use some of the custom tools that Lee and his colleagues have made freely available. Automation, and the resulting data visualization, are both key, because the alternative is a very tedious line-by-line examination of the data. He walks through searching the BlueLeaks data (a collection of data from various law enforcement websites that documents misconduct) for the term "antifa," which makes things very real. He also covers other tools, such as the encrypted messaging app Signal, BitTorrent, disk encryption tools and password managers, explaining how they work and how he used them in his own data explorations.
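The flavor of this kind of automation is easy to convey. Here is a minimal sketch of my own, not one of Lee's actual scripts, that walks a folder of text files (the layout and file names here are hypothetical) and reports every line containing a keyword, much like that BlueLeaks search:

```python
from pathlib import Path

def search_dataset(root, keyword):
    """Report (file, line number, line) for every line containing keyword."""
    keyword = keyword.lower()
    hits = []
    for path in Path(root).rglob("*.txt"):
        # errors="ignore" skips undecodable bytes, common in leaked data.
        text = path.read_text(errors="ignore")
        for lineno, line in enumerate(text.splitlines(), 1):
            if keyword in line.lower():
                hits.append((str(path), lineno, line.strip()))
    return hits
```

A dozen lines like these, run over gigabytes of documents, is exactly the kind of drudgery-saver the book teaches you to write.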

One chapter goes into details about how to read other people’s email, which is a popular activity with stolen data.

The book ends with a series of case studies taken from his own reporting, showing how he conducted his investigations, what code he wrote and what he discovered. The cases include leaks from neo-Nazi chat logs, the anti-vax misinformation group America’s Frontline Doctors and videos leaked from the social media site Parler that were used during one of Trump’s impeachment trials. Do you detect a common thread here? These case studies show how hard data analysis is, but they also walk you through Lee’s processes and tools to illustrate its power as well.

Lee’s book is really the syllabus for a graduate-level course in data journalism, and should be a handy reference for beginners and more experienced readers. If you are a software developer, most of his advice and examples will be familiar. But if you are an ordinary computer user, you can quickly gain a lot of knowledge and see how one tool works with another to build an investigation. As Lee says, “I hope you’ll use your skills to discover and publish secret revelations, and to make a positive impact on the world while you’re at it.”

Book Review: Your Face Belongs to Us by Kashmir Hill

“Instantaneous photographs and newspaper enterprise have invaded the sacred precincts of private and domestic life.” You might be surprised to find out that this quote is more than 130 years old, from a law review article co-authored by Louis Brandeis and inspired by the invention of Kodak film. It appears in a new book, “Your Face Belongs to Us,” by Kashmir Hill, a tech reporter for the NY Times. She chronicles the journey of digital facial recognition software, focusing on Clearview AI's path from scrappy startup to powerful player in the field, and exposes the company's many missteps and failures, along with its successful inroads into becoming a potent law enforcement tool.

Clearview wasn’t the only tech firm to develop facial recognition software: Google, Facebook, Microsoft, IBM, Apple and Amazon all had various projects that they either developed internally or purchased (Google with Pittsburgh Pattern Recognition and Apple with Polar Rose, for example). In each case, these projects were eventually stopped because, as Hill writes, the companies were afraid to deploy them. Facebook, for example, had face recognition projects as early as 2010 “but could afford to bide its time until some other company broke through.” Facebook didn't delete the code, though; it merely turned it off, leaving the door open for some future time when perhaps the technology would be more accepted.

She documents one of the biggest technical challenges: identifying people in candid poses, in dim lighting, on poor-resolution street surveillance cameras, or looking away from the ever-seeing lens. Another challenge is legal, with lawsuits coming at Clearview from literally all corners of the globe. Leading the charge are ACLU lawyer James Ferg-Cadima and the state of Illinois, an early adopter of biometric privacy law.

Clearview has also spurred many activists to protest and lobby for restrictions. One shared his opinion that “face recognition should be thought about in the same way we do about nuclear or biological weapons.” Clearview soon “became a punching bag for global privacy regulators,” she writes, describing several efforts in Europe during the early 2020s that resulted in various fines and restrictions placed on the company.

Police departments were early adopters of Clearview, thanks to today's smartphone users who post everything about their lives online. That has led to one self-inflicted series of legal challenges. Hill documents many cases where the wrong person was identified and then arrested, such as Robert Williams. “It wasn’t a simple matter of an algorithm making a mistake,” she writes. “It was a series of human beings making bad decisions, aided by fallible technology.” She first told that story in a NY Times article entitled “Wrongly Accused by an Algorithm.” In many of these wrongful arrest cases, the accused were Black men, which can be traced back to a lack of non-white images in the training data. (Facebook had this problem for many years with its image recognition algorithm.)

Some of Clearview’s story is inextricably bound to Hill’s own investigations, where early on she tipped off the company about her interests and was initially blocked from learning more about their technology. Eventually, she would interview Clearview’s CEO Hoan Ton-That numerous times to connect the dots. “It was astonishing that Ton-That had gone from building banal Facebook apps to creating world-changing software,” she sums up his career.

The company was determined to “scrape” the web for personal photos, and today various sources claim they have accumulated more than 30 billion images. All of these images, as she points out, were collected without anyone’s explicit permission. This collection would become infamous and exemplify a world “in which people are prejudged based on choices they’ve made in the past, not their behavior in the present,” she wrote. You could say that on the internet, everyone knows you once were a dog.

She finds that Clearview created a “red list” that, by government edict, removes certain VIPs from being tracked by its software. “Being unseen is a privilege,” she writes. Unfortunately, it is getting harder and harder to be unseen: even if you petition Clearview to remove your images from its searches and from public web sources, the company still has a copy buried deep within its database. Her book is an essential document about how this technology has evolved, and what we as citizens have to do to protect ourselves.

Book review: The Traitor by Ava Glass

Accidental superspy Emma is back in this second volume, which can be read independently of the author's first book chronicling her adventures eluding her Russian counterparts. This time she is put on a Russian oligarch's yacht to try to figure out the cause of a fellow secret agent's death in London. Emma is a delightful character, and this book adds to her allure as someone who can kick ass when she needs to but can still pick up the subtle tells of the spies around her. The yacht is sailing between Monaco and Barcelona and is the site of numerous near-mishaps and espionage moments that are just a joy to read. The supporting cast from the first book is back, making the plot points even more compelling. Highly recommended.

Book review: Containing Big Tech by Tom Kemp

Tom Kemp’s new book about the dangers of the five Big Tech companies is several books in one volume. Normally, this would not be a great recommendation, but stick with me here and see if you agree that he has written a very useful, effective, and interesting book.

It is a detailed history of how Microsoft, Google, Meta/Facebook, Amazon and Apple have become tech powerhouses and near-monopolists with a stranglehold on digital services, threatening our privacy in the process. It is a reference work for consumers who are concerned about what private information these vendors share, and how to take back control over their data. It is an operating manual for business IT managers and executives who are looking to comply with privacy regulations and to prevent their own sensitive data from leaking online. And it is a legislative to-do list for how to fashion better data and privacy protection for our digital future.

Kemp focuses on eight different areas of interest, one per chapter. For example, one chapter describes some startling failures at reining in the data broker industry, and another goes into detail about how easily disinformation has prevailed and thrived in the past decade. He mixes his own experience as a tech entrepreneur, investor and executive with very practical matters. Each chapter has a section laying out the issue, then the response of the various tech vendors, and finally a collection of various laws and proposals from both the EU and the US. This last section is a sad tale about the lack of legislative forward motion in the US, and about how the EU has forged ahead with its own laws in this area, only to enforce them lightly.

Speaking of legislation, I asked him what he thought about the lack of any progress in that department, especially at the US federal level. He told me in a recent interview: “No one is going to do anything to modify Section 230 — all previous efforts have been roundly beaten. Eventually, pressure is going to shift to the EU, with its new laws that take effect in 2024. These will require online businesses to monitor their platforms for objectionable speech. They will also give end users the ability to flag content and make the tech vendors more transparent. Tech platforms will then have to finally respond. I don’t see anything happening in the US, nor any new federal privacy laws enacted.”

His unique know-how and the combination of these different perspectives make for a fascinating read. For example, to test Google's claims that it has cleaned up its heavy-handed location monitoring, he did some role playing: he set up appointments at local abortion clinics, visited drug stores, and shopped online. Google recorded his online activity and location data roughly every six minutes. “The real-time nature of this monitoring was impressive. Google knows the ads that they served me, the pages I visited, my Android phone notifications and locations. And despite their promises, they were logging all these details about me,” he said.

Even if you have been diligent about protecting your privacy, you probably don't know that Meta's tracking Pixel is used by a third of the world's most popular websites and is at the heart of numerous privacy lawsuits, especially in Europe. Nor do you likely know the sequence of steps each of the five tech vendors requires to make your activities more private.

Kemp doesn’t pull any punches — he lays blame at the keyboards of these Big Tech vendors and our state and federal legislators. “Big Tech’s anticompetitive practices have also significantly contributed to them becoming these giants who act as gatekeepers to our digital economy,” he writes. “The five Big Tech firms have five of the seven largest cash balances of any S&P 500 company in 2022.”

He documents the missteps that the major tech vendors have taken, all in the service of their almighty algorithms and with the aim of increasing engagement, no matter the costs to society, or to its most at-risk members — namely children.

I asked him about the latest crop of studies that were paid for in part by Meta/Facebook and appeared in various technical journals (and were covered in the NY Times). He told me, “Meta was closely involved in shaping this research and in setting the agenda. It wasn’t a neutral body – they framed the context and provided the data. Part of the problem is that the big tech platforms are talking out of both sides of their mouths. They market their platforms specifically to influence people to buy products from their advertisers. But then their public policy staffs have another message that says they don’t really influence people when bad things happen to them. They certainly haven’t helped the situation via algorithmic amplification of using their services.” I reminded him that big tech trust and safety teams were among the first groups to be fired when the most recent downturn happened.

So get a copy of this book now, both for yourself and your business. If you want to stay abreast of the issues he mentions, check out his very helpful website for post-publication updates.

You may have already taken some of the privacy-enhancing steps he outlines in one of the book's appendices, but you will probably still learn some new tricks to hide your identity.

Book review: Elonka Dunin and Klaus Schmeh’s new Codebreaking edition

What do the authors Beatrix Potter, Rudyard Kipling and Edgar Allan Poe and the British composer Edward Elgar have in common with the Zodiac killer, Mary Queen of Scots, and an enigmatic map left in the 1880s by a Virginian buffalo hunter named Thomas Beale? They all were fascinated by communicating in codes. And if you are too, and want to learn how to break them yourself, you should pick up the latest expanded edition of Codebreaking by Elonka Dunin and Klaus Schmeh, expected in September. Their book takes you through the codes used by these historical luminaries, some of which (such as one of the Zodiac messages and the mysterious Voynich manuscript) have never been broken. And there are plenty that have been solved, such as the single telegram whose decryption helped bring the US into WWI.

The book's focus is, for the most part, on using your wits and pencil and paper to solve the puzzles, although the authors aren't computer-averse: they use the old-fashioned methods to develop the reader's skills, teaching you to pay attention to the frequency distribution of the coded letters and symbols used in the messages, among other tricks of the trade that they describe in detail.
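That frequency counting is simple enough to sketch in a few lines of Python (my own illustration, not code from the book): tally each letter of a message and compare the ranking against what English normally produces, where E, T and A dominate.

```python
from collections import Counter

def letter_frequencies(message):
    """Relative frequency of each letter, most common first --
    the classic first step in attacking a substitution cipher."""
    letters = [c for c in message.upper() if c.isalpha()]
    total = len(letters)
    counts = Counter(letters)
    return [(letter, count / total) for letter, count in counts.most_common()]
```

If the most common symbol in a long ciphertext stands in for E, you already have a foothold, which is exactly the intuition the authors build on.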

Dunin may be a familiar name to you: years ago, I had an opportunity to meet her in person when she spoke at a conference in St. Louis. She is a very impressive person, with a deep history and understanding of the genre. She is tightly associated with Kryptos, the encoded sculpture that sits on the grounds of the CIA campus and still contains an unsolved portion after decades of attempts by the best and brightest cryptographers. Her co-author has written numerous books as well and maintains the Encrypted Books List, a useful companion for learning more about the topic; he also provides the numerous illustrations that begin each of the book's chapters.

This book is nearly 500 pages, chock full of illustrations of the original coded messages as well as other helpful materials that show how codebreakers figure things out. Because of this, I would recommend buying the printed copy rather than an ebook. Each chapter is devoted to particular techniques, such as “hill climbing,” where a computer algorithm makes a long series of small changes to a candidate key, keeping only the changes that improve the fit of the decoded text and thereby continually measuring its progress. This technique has proven very successful at breaking historical codes.
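Here is a toy sketch of the hill-climbing idea (my own simplification, not the book's code), assuming a simple substitution cipher and scoring candidate plaintexts only by single-letter frequencies. Serious codebreaking software scores with much richer statistics, such as quadgram counts, and restarts from many random keys.

```python
import random
import string

# Approximate English letter frequencies, in percent.
ENGLISH_FREQ = {
    "E": 12.7, "T": 9.1, "A": 8.2, "O": 7.5, "I": 7.0, "N": 6.7,
    "S": 6.3, "H": 6.1, "R": 6.0, "D": 4.3, "L": 4.0, "C": 2.8,
    "U": 2.8, "M": 2.4, "W": 2.4, "F": 2.2, "G": 2.0, "Y": 2.0,
    "P": 1.9, "B": 1.5, "V": 1.0, "K": 0.8, "J": 0.15, "X": 0.15,
    "Q": 0.10, "Z": 0.07,
}

def fitness(text):
    """Chi-squared distance from English letter frequencies; lower is better."""
    letters = [c for c in text.upper() if c in string.ascii_uppercase]
    if not letters:
        return float("inf")
    n = len(letters)
    score = 0.0
    for letter, pct in ENGLISH_FREQ.items():
        expected = n * pct / 100
        observed = letters.count(letter)
        score += (observed - expected) ** 2 / expected
    return score

def decrypt(ciphertext, key):
    """key is a 26-letter string mapping cipher letters A-Z to plaintext."""
    table = str.maketrans(string.ascii_uppercase, key)
    return ciphertext.upper().translate(table)

def hill_climb(ciphertext, steps=2000, seed=1):
    """Repeatedly swap two letters of the key, keeping only improvements."""
    rng = random.Random(seed)
    key = list(string.ascii_uppercase)
    rng.shuffle(key)
    best = fitness(decrypt(ciphertext, "".join(key)))
    for _ in range(steps):
        i, j = rng.sample(range(26), 2)
        key[i], key[j] = key[j], key[i]            # propose a small change
        score = fitness(decrypt(ciphertext, "".join(key)))
        if score < best:
            best = score                           # keep the improvement
        else:
            key[i], key[j] = key[j], key[i]        # otherwise revert
    return "".join(key), best
```

On a toy scorer like this the climber gets stuck in local optima quickly, which is why the real tools the authors describe lean on better statistics and many restarts.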

The authors were motivated to update their first edition because so many major codes have been cracked over the past few years, including the aforementioned one by Mary Queen of Scots. The stories of these escapades are what make this book entertaining as well as informative, and you realize that codebreaking is a team sport. The encoded message above, by the way, is from one of the many Zodiac copycats who wrote messages to the police pretending to be the actual killer. See if you can work out what it says.

Book review: Blind Fear

This is the third in the series of “fear” books featuring ex-Navy SEAL Finn in another escapade, this time in Puerto Rico, where he tries to save two children who have been caught up in a series of unfortunate events. Finn is trying to find the kids, who were abducted on a snorkel trip. Meanwhile, two federales land on the island, searching for Finn and tracking him down. The characters, as with the previous two novels, are well drawn, the situations ultra-realistic, the conflicts seemingly vexing. You don't have to read the other books to get involved here, and if you are a fan of Lee Child's Reacher or Brad Thor's books you will find this novel enjoyable, with the usual madcap pace and mayhem.

Book review: The Edge of Sleep

This book is based on the podcast/TV series of the same name, which has been out for several years. The thesis is that a worldwide plague strikes people when they go to sleep, so the obvious conceit is to stay awake while trying to fight it and figure out an antidote. We thus have the real-life pandemic to compare with the fictionalized version, and that may or may not sit well with some readers. We touch on several different groups of people in everyday situations around the world as they try to cope with the calamity, which I think works better in a TV version than trying to keep track of them all throughout the novel. Think of it as a zombie apocalypse without the zombies, which has never been a favorite genre for me. The novel has some terrific descriptions, and the plot takes us to some interesting places. In place of the hyper-science and politics of Covid, we have ordinary folks who are just trying to live their lives and cope with staying awake.

Book review: Breaking Backbones Book 2 by Deb Radcliff

I have known Deb Radcliff as a B2B journalist colleague and now cyber fiction author for more than a decade. Her latest novel in the “Breaking Backbones” series can be read independently of the first volume, and it is a sizzler taken directly from today's cybersecurity news. We have mostly the same motley cast of characters: hackers, ne'er-do-wells, and tough dudes trying to mess up the world now that its central IT authority, GlobeCom, was taken down at the end of the first book. The various hacker clans are desperately trying to free a group of imprisoned programmers somewhere in Russia and stop the evildoers from unleashing their AI-based code on the world. In the meantime, there are plenty of drone attacks to manage, code to review, and personal scores to be settled. There is plenty of dystopia served up in its pages, and a great deal of verisimilitude thanks to Radcliff's familiarity with the subject matter. Will her world succeed in freeing itself from digital enslavement by a crazy autocrat? Well, I won't give away the ending, but it sure was fun reading about it.

Book review: Visual Threat Intelligence by Thomas Roccia

Thomas Roccia has written an unusual and informative book called Visual Threat Intelligence, aimed at security researchers of all experience levels. He is a Senior Security Researcher in Microsoft's Threat Intelligence group, and the founder and curator of Unprotect.it, a database of malware evasion techniques.

Think of it as both a reference guide and a collection of carefully curated tools that can help infosec researchers get smarter about understanding potential threats and the ways in which criminals penetrate networks; the tools covered include YARA, Sigma, and various log analyzers.

For threat intel beginners, he describes the processes involved in breach investigation: how you gather information, vet it, and weigh various competing hypotheses to come up with what actually happened across your computing infrastructure. He then builds on these basics with lots of useful and practical methods, tools, and techniques.

One chapter goes into detail about the more notorious hacks of the past, including Stuxnet, the 2014 Sony hack, and WannaCry. There are timelines of what happened when, graphical representations of how each attack unfolded (such as the overview of the Shamoon attack shown here), mappings of each attack to the diamond model (which focuses on adversaries, infrastructure, capabilities, and victims), and a summary of the relevant MITRE ATT&CK tactics. That is a lot of specific information, presented in an easily readable manner. I have been writing about cybersecurity for many years and haven't seen such a cogent collection of these more infamous attacks in one place.

Roccia also does a deeper dive into his own two-week investigation of NotPetya during the summer of 2017. “It was the first time in my career that I fully realized the wide-ranging impact of a cyberattack — not only on data but also on people,” he wrote.

The book’s appendix contains a long annotated list of various open source tools useful for threat intel analysts. I highly recommend the book if you are interested in learning more about the subject and are looking for a very practical guide that you can use in your own investigations.