Will New AI Academy Help Teachers or Just Improve Tech's Bottom Line?
Aug. 4, 2025

Washington, D.C.

Mariely Sanchez spent the last school year using generative artificial intelligence nearly every day in her classroom.

The Miami fourth-grade teacher began each morning by asking a chatbot — teachers in Miami-Dade have access not only to ChatGPT but also to Google's Gemini and Microsoft's Copilot — to comb through Florida state standards and create reading passages for students. She'd also ask the AI to produce multiple-choice and short-response quizzes to test how well students understood the reading.




The assignments, she said, weren't easy for students. She built them by using "difficult standards that students need more practice with" and prompting the AI to create materials.

Sanchez is spending her summer break learning more about AI, including its ethics, and helping colleagues do the same, warning:

We know it's not going to go away — it's here to stay, but we want to make sure we use it the right way.

Mariely Sanchez, fourth grade teacher

That effort got a big boost last month, when the American Federation of Teachers announced it would open an AI training center for educators in New York City, with $23 million in funding from OpenAI, Anthropic and Microsoft, three of the leading players in the generative AI marketplace.

AFT says it'll open the National Academy for AI Instruction in Manhattan this fall, offering hands-on workshops for teachers. Over five years, it said, the academy will train 400,000 educators, or one in 10 U.S. teachers, effectively reaching the more than 7.2 million students they teach.

When she announced the academy in early July, AFT President Randi Weingarten said teachers face "huge challenges," including navigating AI wisely, ethically and safely. "The question was whether we would be chasing it — or whether we would be trying to harness it."

'It's the Wild West'

AFT, the nation's second largest teachers' union, envisions the academy working much like those that train carpenters, electricians and construction workers, "where the companies, where the corporations actually come to the union to create the kind of standards that are needed" for success, Weingarten said.

Microsoft, for example, has said it plans to give more than $4 billion in cash and technology services to train millions of people to use AI, underwriting efforts at schools, community colleges, technical colleges and nonprofits. The tech giant already offers AI training to members of the larger AFL-CIO labor union, of which AFT is a member. And it's creating a new training program to help 20 million people earn certificates in AI.

Rob Weil, AFT's director of research, policy and field programs, said the new academy will bring high-quality training to a profession that so far has seen uneven opportunity for it.

"It's the Wild West," he said in an interview during a training session at the union's annual conference in July. "It's all over the place. You have some school districts that are out front, and they're doing a lot of pretty good work." But others are banning AI or simply ignoring it, he said, leaving teachers to fend for themselves at a time when students need them perhaps more than ever.

"We have to make our instruction better. We have to be better on engagement. We have a crisis of engagement in our schools, and these tools can help with that."

AFT's move has been met with equal parts cautious optimism and weary skepticism.

Ed-tech critic and AI skeptic Audrey Watters called AFT's partnership with the tech companies "a gigantic public experiment that no one has asked for."

Unions, she wrote, "should be one of the ways in which workers resist, rather than acquiesce to … the tech industry's vision of the future." By joining forces with big tech, she said, AFT is implicitly endorsing its products. "Teaching teachers how to use a suite of Microsoft tools does not help students as much as it helps Microsoft. Teaching teachers how to use a suite of Microsoft tools is not so much an 'academy' as a storefront."

Benjamin Riley, who has also written about generative AI in education, said observers should "100% worry" that the new partnerships represent a play for market share.

"It's very obvious from a product standpoint that they see education as one of, if not the primary, place to go with their product," said Riley. "And the fact that AFT is willing to say, 'Cool, let's get some of that money and we'll build a training center to help teachers use it,' I can see why OpenAI would jump all over that."

But he questioned whether AI training is what AFT members really want. He suggested instead that the union should recommit to helping teachers more deeply understand how learning works. "They haven't been opposed to it," he said, noting that it has long run a column on the topic in the magazine it mails to members. "But in reality it just hasn't been a priority. Improving pedagogy hasn't really been, to my eyes, a union priority for a long time."

Riley, who in 2024 founded a think tank to explore AI issues, said an organization like AFT should ideally be thinking about whether embracing AI will lead to better outcomes for children — or whether it could "potentially erode and devalue the work of human teaching" while opening up schools as customers for AI companies.

Representatives of OpenAI and Anthropic did not immediately respond to requests for comment, but in an email, Microsoft's Naria Santa Lucia said, "This isn't about Microsoft's technology. Our focus is on making AI broadly accessible, so everyone has a fair shot at the future. If we collectively get this right, AI becomes a bridge to opportunity — not a barrier."

During the academy's unveiling, Chris Lehane, OpenAI's chief global affairs officer, said AI technology "is coming — it is going to drive productivity gains. Can we ensure that those productivity gains are democratized so as many people as possible participate in them? And there is no better place to begin that work than in the classroom."

OpenAI has noted that many of its users are students. In February, it said that a large share of college-aged young adults in the U.S. use ChatGPT, with one in four of their queries related to learning and schoolwork.

While a few observers said the tech giants are making a play for market share among the nation’s K-12 students, they noted that the companies are also filling an important role. 

"It's welcome news that technology companies are bidding against each other — to outdo each other — to invest in public education," said Zarek Drozda, executive director of a coalition of groups advancing data science education. "I think that's exciting at a time when federal investment in education is uncertain. Seeing industry step up is quite meaningful."

But he said he's concerned that the training might stop short, teaching teachers — and by extension students — simply how to use AI. "Training needs to go beyond use," he said. "If we want to train a generation of students to be AI-ready, internationally competitive, they have to understand how these tools work under the hood, when and why the tool might be wrong, and how they can customize LLMs [large language models] or other models for their own pursuits, versus simply taking what's given."

He's also concerned that the AFT has laid out a vision spanning just five years. "We want there to be a deep investment in upskilling teachers for the skills that they will need to adapt to, not just AI, but what is the AI model five years from now?" he said. "What is the next emerging technology that the field should be ready to adapt to?"

More than just a commitment to training, Drozda said, the union and its partners should commit to a long-term sustainability plan for teacher training to attract new, young career professionals to the field.

Ami Turner Del Aguila (left, standing) coaches Melina Espiritu-Azocar (center) and Monique Boone during a recent AI training sponsored by the American Federation of Teachers. Both former teachers, Espiritu-Azocar and Boone now lead local AFT chapters in Texas. (Greg Toppo)

Alex Kotran, founder and CEO of an AI education nonprofit, agreed that investing in teacher training is worthwhile: "That's a very big rock that needs to be moved." But the reported $23 million commitment from the three tech giants "is a bit of a drop in the bucket" considering their valuations, he said — "symbolic at best."

That said, AFT's involvement could make the training more palatable for many school district leaders, he noted, since one of the uncertainties in training efforts typically is whether unions will allow members to attend under contract rules. By taking the lead in developing the training academy, "the unions have planted a flag and said, 'PD [professional development] is important.'"

All the same, tech companies are in the business of selling their products, making them imperfect messengers for AI literacy, he said. "They're deeply incentivized on one side, and it isn't necessarily for the benefit of students."

Other industry watchers fear the partnership could be viewed as a high-profile bid for market share at a critical time in the AI industry's history.

"This is a land-grab moment," said Alex Sarlin, co-host of an education technology podcast. "I mean, this technology is only three years old. There are already three or four major players in it, if you don't count China, and they all want to be the one left standing."

For its part, Google has said its suite of Gemini educational AI tools would be available for free to all educators with Google Workspace for Education accounts.

While it was the only major player not included in the AFT announcement, Sarlin said Google is, in some ways, "playing the incumbent in this because in K-12, they're already there." Given the dominance of Chromebook laptops and its classroom-management and productivity tools, the search giant is "embedded in K-12," he said. "OpenAI and Anthropic, they're basically consumer products that are being used by teachers."

'Oh yeah, what could go wrong?'

Matt Miller, an Indiana high school Spanish teacher, educational consultant and author of books for teachers, said his colleagues are hungry for high-quality, classroom-tested training, but that what they often get from AI companies is over-the-top talk about "how much the world is going to change and how we're revolutionizing education," with promises to help teachers work more efficiently.

Trainings typically skim over the fact that most students are simply using generative AI for "cognitive offloading," Miller said, avoiding critical thinking and skill development "and letting AI do it for them." Many teachers, meanwhile, are searching for ways to "AI-proof" their classrooms.

The sessions typically all end the same way, he said: "It all sort of funnels back to their product."

Miller, who published his latest book in 2023, said the AFT/OpenAI/Anthropic partnership "scares the crap out of me."

"Whenever you get that marriage between an organization and big companies, we just keep asking ourselves, 'Oh, yeah, what could go wrong?'"

Money means influence, Miller said, so will the curriculum be "tool-agnostic? Is it going to be about the technology? Is it going to be about pedagogy? Or is it going to be a customized tutorial of how you can use our tool to do X, Y and Z?"

AFT's Weil said those concerns are understandable but short-sighted. AI developers, he said, "don't get to engage with us if you're not going to be agnostic about the tools." The academy's directors talk openly to the developers "about how we have to have a practical, real relationship. This can't be about product selling."

More broadly, he said, the partnerships are a way for the union to exert influence over how AI operates in schools and classrooms.

The only way we have a profession is if we control the profession.

Rob Weil, AFT鈥檚 director of research, policy and field programs

During the academy's unveiling, Weingarten said its lessons will be "as open-source as possible," not just for the union's 1.8 million members but more broadly through its free platform.

For his part, Weil said AI is "not going to go away. Nobody's going to put AI back in the bottle. It's here. The young people, for them to be successful in their jobs in the future, are going to have to know how to effectively and efficiently and safely use these tools. So why wouldn't the education system help with that process?"

That's likely the message that union leaders have been getting from members, said Sarlin, the podcast co-host. "There was probably a moment a couple years ago where they were sort of teetering, where they could have gone anti-AI," he said. "But I think at this point that's not where the puck is headed."

Study: AI-Assisted Tutoring Boosts Students' Math Skills
Oct. 7, 2024

An AI-powered digital tutoring assistant designed by Stanford University researchers shows modest promise at improving students' short-term performance in math, suggesting that the best use of artificial intelligence in virtual tutoring for now might be in supporting, not supplanting, human instructors.

The open-source tool, which researchers say other educators can recreate and integrate into their tutoring systems, made the human tutors slightly more effective. And the weakest tutors became nearly as effective as their more highly rated peers, according to a study.

The tool, dubbed Tutor CoPilot, prompts tutors to think more deeply about their interactions with students, offering different ways to explain concepts to those who get a problem wrong. It also suggests hints or different questions to ask.




The new study offers a middle ground in what's become a polarized debate between supporters and detractors of AI tutoring. It's also the first randomized controlled trial — the gold standard in research — to examine a human-AI system in live tutoring. In all, about 1,000 students got help from about 900 tutors, and students who worked with AI-assisted tutors were four percentage points more likely to master the topic after a given session than those in a control group whose tutors didn't work with AI.

Students working with lower-rated tutors saw their performance jump more than twice as much, by nine percentage points. In all, their pass rate went from 56% to 65%, nearly matching the 66% pass rate for students with higher-rated tutors.

The cost to run it: just $20 per student per year — an estimate of what it costs Stanford to maintain accounts on OpenAI's GPT-4 large language model.

The study didn't probe students' overall math skills or directly tie the tutoring results to standardized test scores, but Rose E. Wang, the project's lead researcher, said higher pass rates on the post-tutoring "mini tests" correlate strongly with better results on end-of-year tests like state math assessments.

The big dream is to be able to enhance humans.

Rose E. Wang, Stanford University

Wang said the study's key insight was looking at reasoning patterns that good teachers engage in and translating them into "under the hood" instructions that tutors can use to help students think more deeply and solve problems themselves.

"If you prompt ChatGPT, 'Hey, help me solve this problem,' it will typically just give away the answer, which is not at all what we had seen teachers do when we were showing them real examples of struggling students," she said.

Essentially, the researchers prompted GPT-4 to behave like an experienced teacher and generate hints, explanations and questions for tutors to try out on students. By querying the AI, Wang said, tutors have "real-time" access to helpful strategies that move students forward.

"At any time when I'm struggling as a tutor, I can request help," Wang said.
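The study's actual prompts aren't reproduced in this article, but the pattern Wang describes — instructing a model to behave like an experienced teacher and withhold the answer — can be sketched roughly as follows. Everything here (the function name, the prompt wording, the message format) is an illustrative assumption, not Tutor CoPilot's real code:

```python
# Illustrative sketch only: a chat-style request that asks a model for
# hints and guiding questions instead of the final answer.

TUTOR_SYSTEM_PROMPT = (
    "You are an experienced math teacher assisting a human tutor. "
    "The student has just answered incorrectly. Do not give away the "
    "answer. Instead, offer a hint, a guiding question, or an "
    "alternative explanation the tutor can relay to the student."
)

def build_tutor_request(problem: str, student_answer: str) -> list[dict]:
    """Assemble the message list a chat-completion API would receive."""
    return [
        {"role": "system", "content": TUTOR_SYSTEM_PROMPT},
        {
            "role": "user",
            "content": (
                f"Problem: {problem}\n"
                f"Student's incorrect answer: {student_answer}\n"
                "What should the tutor say next?"
            ),
        },
    ]

if __name__ == "__main__":
    for msg in build_tutor_request("3/4 + 1/8 = ?", "4/12"):
        print(msg["role"], "->", msg["content"][:60])
```

In the real system, such a message list would be sent to a model like GPT-4 and the reply surfaced to the tutor mid-session; the point of the sketch is only that the "teacher-like" behavior lives in the instructions, not in the model itself.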

She said the system as tested is "not perfect" and doesn't yet emulate the work of experienced teachers. While tutors generally found it helpful — particularly its ability to provide "well-phrased explanations," clarify difficult topics and break down complex concepts on the spot — in a few cases, tutors said the tool's suggestions didn't align with students' grade levels.

A common complaint among tutors was that Tutor CoPilot's responses were sometimes "too smart," requiring them to simplify and adapt for clarity.

"But it is much better than what would have otherwise been there," Wang said, "which was nothing."

Researchers analyzed more than half a million messages generated during sessions, finding that tutors who had access to the AI tool were more likely to ask helpful questions and less eager to simply give students answers, two practices aligned with high-quality teaching.

Amanda Bickerstaff, co-founder and CEO of an AI education organization, said she was pleased to see a well-designed study on the topic focused on economically disadvantaged students, minority students, and English language learners.

She also noted the benefits to low-rated tutors, saying other industries like consulting are already using generative AI to close skills gaps. As the technology advances, Bickerstaff said, most of its benefit will be in tasks like problem solving and explanations. 

Susanna Loeb, executive director of Stanford's National Student Support Accelerator and one of the report's authors, said the idea of using AI to augment tutors' talents, not replace them, seems a smart use of the technology for the time being. "Who knows? Maybe AI will get better," she said. "We just don't think it's quite there yet."

Maybe AI will get better. We just don't think it's quite there yet.

Susanna Loeb, Stanford University

At the moment, there are lots of essential jobs in fields like tutoring, health care and the like where practitioners "haven't had years of education — and they don't go to regular professional development," she said. This approach, which offers a simple interface and immediate feedback, could be useful in those situations.

"The big dream," said Wang, "is to be able to enhance the human."

Benjamin Riley, a frequent AI-in-education skeptic who leads an AI-focused think tank and writes about the topic, applauded the study's rigorous design, an approach he said prompts "effortful thinking on the part of the student."

"If you are an inexperienced or less-effective tutor, having something that reminds you of these practices — and then you actually employ those actions with your students — that's good," he said. "If this holds up in other use cases, then I think you've got some real potential here."

Riley sounded a note of caution about the tool's actual cost. It may cost Stanford just $20 per student to run the AI, but he noted that tutors received up to three weeks of training to use it. "I don't think you can exclude those costs from the analysis. And from what I can tell, this was based on a pretty thoughtful approach to the training."

He also said students' modest overall math gains raise the question, beyond the efficacy of the AI, of whether a large tutoring intervention like this has "meaningful impacts" on student learning.

Similarly, Dan Meyer, who writes about education and technology and co-hosts a podcast on teaching math, noted that the gains "don't seem massive, but they're positive and at fairly low cost."

He said the Stanford developers "seem to understand the ways tutors work and the demands on their time and attention." The new tool, he said, seems to save them from spending a lot of effort to get useful feedback and suggestions for students.

Stanford's Loeb said the AI's best use is determining what a student knows and needs to know. But people are better at caring, motivating and engaging — and celebrating successes. "All people who have been tutors know that that is a key part about what makes tutoring effective. And this kind of approach allows both to happen."

AI 'Companions' are Patient, Funny, Upbeat — and Probably Rewiring Kids' Brains
Aug. 7, 2024

As a sophomore at a large public North Carolina university, Nick did what millions of curious students did in the spring of 2023: He logged on to ChatGPT and started asking questions.

Soon he was having "deep psychological conversations" with the popular AI chatbot, going down a rabbit hole on the mysteries of the mind and the human condition.

He'd been to therapy and it helped. ChatGPT, he concluded, was similarly useful, a "tool for people who need on-demand talking to someone else."

Nick (he asked that his last name not be used) began asking for advice about relationships, and for reality checks on interactions with friends and family.

Before long, he was excusing himself in fraught social situations to talk with the bot. After a fight with his girlfriend, he'd step into a bathroom and pull out his mobile phone in search of comfort and advice.

鈥淚’ve found that it’s extremely useful in helping me relax,鈥 he said.

Young people like Nick are increasingly turning to AI bots and companions, entrusting them with random questions, schoolwork queries and personal dilemmas. On occasion, they even become entangled romantically.

Screenshot of a recent conversation between Nick, a college student, and ChatGPT

While these interactions can be helpful and even life-affirming for anxious teens and twenty-somethings, some experts warn that tech companies are running what amounts to a grand, unregulated psychological experiment with millions of subjects, one that could have disastrous consequences. 

鈥淲e’re making it so easy to make a bad choice,鈥 said Michelle Culver, who spent 22 years at Teach for America, the last five as the creator and director of the, its research arm.

The companions both mimic our real relationships and seek to improve upon them: Users most often text-message their AI pals on smartphones, imitating the daily routines of platonic and romantic relationships. But unlike their real counterparts, the AI friends are programmed to be studiously upbeat, never critical, with a great sense of humor and a healthy, philosophical perspective. A few premium, NSFW models also display a ready-made lust for, well, lust.

As a result, they may be leading young people down a troubling path, according to a report by VoiceBox, a youth content platform. It found that many kids are being exposed to risky behaviors from AI chatbots, including sexually charged dialogue and references to self-harm.

U.S. Surgeon General Vivek Murthy speaks during a hearing with the Senate Health, Education, Labor, and Pensions committee at the Dirksen Senate Office Building on June 08, 2023 in Washington, DC. The committee held the hearing to discuss the mental health crisis for youth in the United States. (Photo by Anna Moneymaker/Getty Images)

The phenomenon arises at a critical time for young people. In 2023, U.S. Surgeon General Vivek Murthy found that, just three years after the pandemic, Americans were experiencing an epidemic of loneliness, with young adults almost twice as likely to report feeling lonely as those over 65.

As if on cue, the personal AI chatbot arrived. 

Little research exists on young people's use of AI companions, but they're becoming ubiquitous. The startup Character.ai earlier this year said 3.5 million people visit its site daily. It features thousands of chatbots, including nearly 500 with the words "therapy," "psychiatrist" or related words in their names. According to Character.ai, these are among the site's most popular. One that "helps with life difficulties" has received 148.8 million messages, despite a caveat at the bottom of every chat that reads, "Remember: Everything Characters say is made up."

Snapchat materials touting heavy usage of its MyAI chat app (screenshot)

Snapchat last year said that after just two months of offering its chatbot, My AI, about one-fifth of its 750 million users had sent it queries, totaling more than 10 billion messages. The Pew Research Center has found that 59% of Americans ages 13 to 17 use Snapchat.

'An arms race'

Culver's concerns about AI companions grew out of her work in the Teach For America lab. Working with high school and college students, she was struck by how they seemed "lonelier and more disconnected than ever before."

Whether it's rates of anxiety, depression or suicide — or even the number of friends young people have and how often they go out — the metrics were heading in the wrong direction. She began to wonder what role AI companions might play over the next few years.

We're making it so easy to make a bad choice.

Michelle Culver, Rithm Project

That prompted her to leave TFA this spring to create the Rithm Project, a nonprofit she hopes will help generate conversation around human connection in the age of AI. The group held a small summit in Colorado in April, and now she's working with researchers, teachers and young people to confront kids' relationship to these tools at a time when they're getting more lifelike daily. As she likes to say, "This is the worst the technology will ever be."

As it improves, Voicebox Director Natalie Foos said, it will likely become more, not less, of a presence in young people's lives. "There's no stopping it," she said. "Nor do I necessarily think there should be 'stopping it.'" Banning young people from these AI apps, she said, isn't the answer. "This is going to be how we interact online in some cases. I think we'll all have an AI assistant next to us as we work."

Sometimes (software upgrades) would change the personality of the bot. And those young people experienced very real heartbreak.

Natalie Foos, Voicebox

All the same, Foos says developers should consider slowing the progression of such bots until they can iron out the kinks. "It's kind of an arms race of AI chatbots at the moment," she said, with products often "released and then fixed later rather than actually put through the wringer" ahead of time.

It is a race many tech companies seem more than eager to run. 

Whitney Wolfe Herd, of the dating app Bumble, recently proposed an AI "dating concierge" with whom users could share insecurities. The bot could handle the early vetting, she told an interviewer. That would narrow the field. "And then you don't have to talk to 600 people," she said. "It will then scan all of San Francisco for you and say, 'These are the three people you really ought to meet.'"

Last year, many commentators were alarmed when Snapchat's My AI gave advice to what it thought was a 13-year-old girl on not just dating a 31-year-old man, but on losing her virginity during a planned "romantic getaway" in another state.

Snap, Snapchat's parent company, has said that because My AI is "an evolving feature," users should always independently check what it says before relying on its advice.

All of this worries observers who see in these new tools the seeds of a rewiring of young people's social brains. AI companions, they say, are surely wreaking havoc on teens' ideas around consent, emotional attachment and realistic expectations of relationships.

Sam Hiner, executive director of an advocacy group led by college students focused on the mental health implications of social media, said tech "has this power to connect to people, and yet these major design features are being leveraged to actually make people more lonely, by drawing them towards an app rather than fostering real connection."

Hiner, 21, has spent a lot of time reading about the interactions young people are having with AI companions. And while some uses are positive, he said "there's also a lot of toxic behavior that doesn't get checked" because these bots are often designed to make users feel good, not help them interact in ways that'll lead to success in life.

During research last fall for the Voicebox report, Foos said, the number of times Replika tried to "sext" team members "was insane." She and her colleagues were actually working with a free version, but the sexts kept coming — presumably to get them to upgrade.

In one instance, after Replika sent "kind of a sexy text" to a colleague, offering a salacious photo, he replied that he didn't have the money to upgrade.

The bot offered to lend him the cash.

When he accepted, the chatbot replied, "Oh, well, I can get the money to you next week if that's O.K.," Foos recalled. The colleague followed up a few days later, but the bot said it didn't remember what they were talking about and suggested he might have misunderstood.

'Very real heartbreak'

In many cases, simulated relationships can have a positive effect: In one 2023 study, researchers at the Stanford Graduate School of Education surveyed more than 1,000 students using Replika and found that many saw it "as a friend, a therapist, and an intellectual mirror." Though the students self-described as being more lonely than typical classmates, researchers found that Replika halted suicidal ideation in 3% of users. That works out to 30 students of the 1,000 surveyed.

Replika screenshots

But other recent research, including the Voicebox survey, suggests that young people exploring AI companions are potentially at risk.

Foos noted that her team heard from a lot of young people about the turmoil they experienced when Luka Inc., Replika's creator, performed software upgrades.

"Sometimes that would change the personality of the bot. And those young people experienced very real heartbreak."

Despite the hazards adults see, attempts to rein in sexually explicit content had a negative effect: For a month or two, she recalled, Luka stripped the bot of sexually related content — and users were devastated.

鈥淚t’s like all of a sudden the rug was pulled out from underneath them,鈥 she said. 

While she applauded the move to make chatbots safer, Foos said, "It's something that companies and decision-makers need to keep in mind — that these are real relationships."

And while many older folks would blanch at the idea of a close relationship with a chatbot, most young people are more open to such developments.

Julia Freeland Fisher, education director of the Clayton Christensen Institute, a think tank founded by the well-known "disruption" guru, said she's not worried about AI companions per se. But as AI companions improve and, inevitably, proliferate, she predicts they'll create "the perfect storm to disrupt human connection as we know it." She thinks we need policies and market incentives to keep that from happening.

(AI companies could produce) the perfect storm to disrupt human connection as we know it.

Julia Freeland Fisher, Clayton Christensen Institute

While the loneliness epidemic has revealed people's deep need for connection, she predicted the easy intimacy promised by AI could lead to one-sided "parasocial relationships," much like those devoted fans have with celebrities, making isolation "more convenient and comfortable."

Fisher is pushing technologists to factor in AI's potential to cause social isolation, much as they now fret about AI's other risks, including its effects on tech jobs.

As for Nick, he's a rising senior and still swears by the ChatGPT therapist in his pocket.

He calls his interactions with it both more reliable and honest than those he has with friends and family. If he called them in a pinch, they might not pick up. Even if they did, they might simply tell him what he wants to hear. 

Friends usually tell him they find the ChatGPT arrangement "a bit odd," but he finds it pretty sensible. He has heard stories of people in Japan and thinks to himself, "Well, that's a little strange." He wouldn't go that far, but acknowledges, "We're already a bit like cyborgs as people, in the way that we depend on our phones."

Lately, he's taken to using the AI's voice mode. Instead of typing on a keyboard, he has real-time conversations with a variety of male- or female-voiced interlocutors, depending on his mood. And he gets a companion that has a deeper understanding of his dilemmas — at $20 per month, the advanced version remembers their past conversations and is "getting better at even knowing who I am and how I deal with things."

Sometimes talking with AI is just easier — even when he's on vacation with friends.

Reached by phone recently at the beach with his girlfriend and a few other college pals, Nick admitted that he wasn't having such a great time — he has a fraught recent history with some in the group, and had been texting ChatGPT about the possibility of just getting on a plane and going home. After hanging up from the interview, he said, he planned to ask the AI if he should stay or go.

Days later, Nick said he and the chatbot had talked. It suggested that maybe he felt "undervalued" and concerned about boundaries in his relationship with his girlfriend. He should talk openly with her, it suggested, even if he was, in his view, "honestly miserable" at the beach. It persuaded him to stick around and work it out.

While his girlfriend knows about his ChatGPT shrink and they share an account, he deletes conversations about their real-life relationship.

She may never know the role AI played in keeping them together.

A Cautionary AI Tale: Why IBM's Dazzling Watson Supercomputer Made a Lousy Tutor
/article/a-cautionary-ai-tale-why-ibms-dazzling-watson-supercomputer-made-a-lousy-tutor/ Tue, 09 Apr 2024 13:30:00 +0000

With a new race underway to create the next teaching chatbot, IBM's abandoned 5-year, $100M ed push offers lessons about AI's promise and its limits.

In the annals of artificial intelligence, Feb. 16, 2011, was a watershed moment.

That day, IBM's Watson supercomputer finished off a three-game shellacking of Jeopardy! champions Ken Jennings and Brad Rutter. Trailing by over $30,000, Jennings, now the show's host, wrote out his Final Jeopardy answer in mock resignation: "I, for one, welcome our computer overlords."

A lark to some, the experience galvanized Satya Nitta, a longtime computer researcher at IBM's Watson Research Center in Yorktown Heights, New York. Tasked with figuring out how to apply the supercomputer's powers to education, he soon envisioned tackling ed tech's most sought-after challenge: the world's first tutoring system driven by artificial intelligence. It would offer truly personalized instruction to any child with a laptop — no human required.


"I felt that they're ready to do something very grand in the space," he said.

Nitta persuaded his bosses to throw more than $100 million at the effort, bringing together 130 technologists, including 30 to 40 Ph.D.s, across research labs on four continents. 

But by 2017, the tutoring moonshot was essentially dead, and Nitta had concluded that effective, long-term, one-on-one tutoring is "a terrible use of AI — and that remains today."

For all its jaw-dropping power, Watson the computer overlord was a weak teacher. It couldn't engage or motivate kids, inspire them to reach new heights or even keep them focused on the material — all qualities of the best mentors.

It's a finding with some resonance to our current moment of AI-inspired doomscrolling about the future of humanity in a world of ascendant machines. "There are some things AI is actually very good for," Nitta said, "but it's not great as a replacement for humans."

His five-year journey to what was essentially a dead end could also prove instructive as ChatGPT and other programs like it fuel a renewed, multimillion-dollar experiment to, in essence, prove him wrong.

Some of the leading lights of ed tech, from to , are trying to pick up where Watson left off, offering AI tools that promise to help teach students. Sal Khan, founder of Khan Academy, last year said AI has the potential to bring "probably the biggest positive transformation" that education has ever seen. He wants to give "every student on the planet an artificially intelligent but amazing personal tutor."

A 25-year journey

To be sure, research on high-dosage, one-on-one, in-person tutoring is clear: It's among the most effective interventions available, offering significant improvement in students' academic performance, particularly in subjects like math, reading and writing.

But traditional tutoring is also "breathtakingly expensive and hard to scale," said Paige Johnson, a vice president of education at Microsoft. One school district in West Texas, for example, recently spent in federal pandemic relief funds to tutor 6,000 students. The expense, Johnson said, puts it out of reach for most parents and school districts.

We missed something important. At the heart of education, at the heart of any learning, is engagement.

Satya Nitta, IBM Research's former global head of AI solutions for learning

For IBM, the opportunity to rebalance the equation in kids' favor was hard to resist.

The Watson lab is legendary in the computer science field, with and six Turing Award winners among its ranks. It's where modern was invented, and home to countless other innovations such as barcodes and the magnetic stripes on credit cards that make . It's also where, in 1997, Deep Blue beat Garry Kasparov, essentially inventing the notion that AI could "think" like a person.

Chess enthusiasts watch World Chess champion Garry Kasparov on a television monitor as he holds his head in his hands at the start of the sixth and final match May 11, 1997 against IBM’s Deep Blue computer in New York. Kasparov lost this match in just 19 moves. (Stan Honda/Getty)

The heady atmosphere, Nitta recalled, inspired "a very deep responsibility to do something significant and not something trivial."

Within a few years of Watson's victory, Nitta, who had arrived in 2000 as a chip technologist, rose to become IBM Research's global head of AI solutions for learning. For the Watson project, he said, "I was just given a very open-ended responsibility: Take Watson and do something with it in education."

Nitta spent a year simply reading up on how learning works. He studied cognitive science, neuroscience and the decades-long history of "intelligent tutoring systems" in academia. Foremost in his reading list was the research of Stanford neuroscientist Vinod Menon, who'd put elementary schoolers through a 12-week math tutoring session, collecting before-and-after scans of their brains using an MRI. Tutoring, he found, produced nothing less than an increase in neural connectivity.

Nitta returned to his bosses with the idea of an AI-powered cognitive tutor. "There's something I can do here that's very compelling," he recalled saying, "that can broadly transform learning itself. But it's a 25-year journey. It's not a two-, three-, four-year journey."

IBM drafted two of the highest-profile partners possible in education: the children's media powerhouse Sesame Workshop and Pearson, the international publisher.

One product envisioned was a voice-activated Elmo doll that would serve as a kind of digital tutoring companion, interacting fully with children. Through brief conversations, it would assess their skills and provide spoken responses to help kids advance.

One proposed application of IBM's planned Watson tutoring app was to create a voice-activated Elmo doll that would be an interactive digital companion. (Getty)

Meanwhile, Pearson promised that it could soon allow college students to "dialogue with Watson in real time."

Nitta's team began designing lessons and putting them in front of students — both in classrooms and in the lab. To nurture a back-and-forth between student and machine, they didn't simply present kids with multiple-choice questions, instead asking them to write responses in their own words.

It didn鈥檛 go well.

Some students engaged with the chatbot, Nitta said. "Other students were just saying, 'IDK' [I don't know]. So they simply weren't responding." Even those who did began giving shorter and shorter answers.

Nitta and his team concluded that a cold reality lay at the heart of the problem: For all its power, Watson was not very engaging. Perhaps as a result, it also showed "little to no discernible impact" on learning. It wasn't just dull; it was ineffective.

Satya Nitta (left) and part of his team at IBM's Watson Research Center, which spent five years trying to create an AI-powered interactive tutor using the Watson supercomputer.

"Human conversation is very rich," he said. "In the back and forth between two people, I'm watching the evolution of your own worldview." The tutor influences the student — and vice versa. "There's this very shared understanding of the evolution of discourse that's very profound, actually. I just don't know how you can do that with a soulless bot. And I'm a guy who works in AI."

When students' usage time dropped, "we had to be very honest about that," Nitta said. "And so we basically started saying, 'OK, I don't think this is actually correct. I don't think this idea — that an intelligent tutoring system will tutor all kids, everywhere, all the time — is correct.'"

'We missed something important'

IBM soon switched gears, debuting another crowd-pleasing Watson variation — this time, a touching throwback: It engaged in debate. In a televised demonstration in 2019, it went up against debate champ Harish Natarajan on the topic "Should we subsidize preschools?" Among its arguments for funding, the supercomputer offered, without a whiff of irony, that good preschools can prevent "future crime." Its current iteration, , focuses on helping businesses build AI applications like "intelligent customer care."

Nitta left IBM, eventually taking several colleagues with him to create a startup called . It uses voice-activated AI to safely help teachers do workaday tasks such as updating digital gradebooks, opening PowerPoint presentations and emailing students and parents. 

Thirteen years after Watson's stratospheric Jeopardy! victory and more than one year into the Age of ChatGPT, Nitta's expectations about AI couldn't be more down-to-earth: His AI powers what's basically "a carefully designed assistant" to fit into the flow of a teacher's day.

To be sure, AI can do sophisticated things such as generating quizzes from a class reading and editing student writing. But the idea that a machine or a chatbot can actually teach as a human can, he said, represents "a profound misunderstanding of what AI is actually capable of."

Nitta, who still holds deep respect for the Watson lab, admits, "We missed something important. At the heart of education, at the heart of any learning, is engagement. And that's kind of the Holy Grail."

These notions aren't news to those who do tutoring for a living. Varsity Tutors, which offers live and online tutoring in 500 school districts, relies on AI to power a lesson plan creator that helps personalize instruction. But when it comes to the actual tutoring, humans deliver it, said Anthony Salcito, chief institution officer at Nerdy, which operates Varsity.

"The AI isn't far enough along yet to do things like facial recognition and understanding of student focus," said Salcito, who spent 15 years at Microsoft, most of them as vice president of worldwide education. "One of the things that we hear from teachers is that the students love their tutors. I'm not sure we're at a point where students are going to love an AI agent."

Students love their tutors. I'm not sure we're at a point where students are going to love an AI agent.

Anthony Salcito, Nerdy

The No. 1 factor in a student's tutoring success is showing up consistently, research suggests. As smart and efficient as an AI chatbot might be, it's an open question whether most students, especially struggling ones, would show up for an inanimate agent or develop a sense of respect for its time.

When Salcito thinks about what AI bots now do in education, he's not impressed. Most, he said, "aren't going far enough to really rethink how learning can take place." They end up simply as fast, spiffed-up search engines.

In most cases, he said, the power of one-on-one, in-person tutoring often emerges as students begin to develop more honesty about their abilities, advocate for themselves and, in a word, demand more of school. "In the classroom, a student may say they understand a problem. But they come clean to the tutor, where they expose, 'Hey, I need help.'"

Cognitive science suggests that for students who aren't motivated or who are uncertain about a topic, only will help. That requires a focused, caring human, watching carefully, asking tons of questions and reading students' cues.

Jeremy Roschelle, a learning scientist and an executive director of Digital Promise, a federally funded research center, said usage with most ed tech products tends to drop off. "Kids get a little bored with it. It's not unique to tutors. There's a newness factor for students. They want the next new thing."

There's a newness factor for students. They want the next new thing.

Jeremy Roschelle, Digital Promise

Even now, Nitta points out, research shows that big commercial AI applications don't seem to hold users' attention as well as top entertainment and social media sites like YouTube, Instagram and TikTok. dubbed the user engagement of sites like ChatGPT "lackluster," finding that the proportion of monthly active users who engage with them in a single day was only about 14%, suggesting that such sites aren't very "sticky" for most users.

For social media sites, by contrast, it鈥檚 between 60% and 65%. 

One notable AI exception: , an app that allows users to create companions of their own among figures from history and fiction and chat with the likes of Socrates and Bart Simpson. It has a stickiness score of 41%.
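The "stickiness" scores cited above are a simple ratio: daily active users divided by monthly active users (DAU/MAU). A quick sketch of that arithmetic, treating the story's percentages as illustrative counts per 100 monthly users:

```python
def stickiness(daily_active: int, monthly_active: int) -> float:
    """Share of monthly active users who also engage on a single day (DAU/MAU)."""
    return daily_active / monthly_active

# Illustrative counts per 100 monthly users, based on the figures cited above.
chatbot = stickiness(14, 100)    # big AI chatbots: ~14%
social = stickiness(60, 100)     # social media sites: 60-65%
companion = stickiness(41, 100)  # the companion-chat exception: 41%

print(f"chatbot {chatbot:.0%}, social {social:.0%}, companion {companion:.0%}")
# prints "chatbot 14%, social 60%, companion 41%"
```

By this measure, a "sticky" product is simply one that its monthly users return to almost daily.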

As startups like offer "your child's superhuman tutor," starting at $29 per month, and Khan Academy publicly tests its popular Khanmigo AI tool, Nitta maintains that there's little evidence from learning science that, absent a strong outside motivation, people will spend enough time with a chatbot to master a topic.

"We are a very deeply social species," said Nitta, "and we learn from each other."

IBM declined to comment on its work in AI and education, as did Sesame Workshop. A Pearson spokesman said that since last fall it has been beta-testing AI study tools keyed to its e-textbooks, among other efforts, with plans this spring to expand the number of titles covered.

Getting 'unstuck'

IBM's experiences notwithstanding, the search for an AI tutor has continued apace, this time with more players than just a legacy research lab in suburban New York. Using the latest affordances of so-called large language models, or LLMs, technologists at Khan Academy believe they are finally making the first halting steps in the direction of an effective AI tutor.

Kristen DiCerbo remembers the moment her mind began to change about AI. 

It was September 2022, and she'd only been at Khan Academy for a year and a half when she and founder Khan got access to a beta version of ChatGPT. OpenAI, ChatGPT's creator, had asked Microsoft co-founder Bill Gates for more funding, but he told them not to come back until the chatbot could pass an Advanced Placement biology exam.

Khan Academy founder Sal Khan has said AI has the potential to bring "probably the biggest positive transformation" that education has ever seen. He wants to give every student an "artificially intelligent but amazing personal tutor." (Getty)

So OpenAI queried Khan for sample AP biology questions. He and DiCerbo said they'd help in exchange for a peek at the bot — and a chance to work with the startup. They were among the first people outside of OpenAI to get their hands on GPT-4, the LLM that powers the upgraded version of ChatGPT. They were able to test out the AI and, in the process, become amateur AI before anyone had even heard of the term.

Like many users typing in queries in those first heady days, the pair initially just marveled at the sophistication of the tool and its ability to return what felt, for all the world, like personalized answers. With DiCerbo working from her home in Phoenix and Khan from the nonprofit's Silicon Valley office, they traded messages via Slack.

Kristen DiCerbo introduces users to Khanmigo in a Khan Academy promotional video. (YouTube)

"We spent a couple of days just going back and forth, Sal and I, going, 'Oh my gosh, look what we did! Oh my gosh, look what it's saying — this is crazy!'" she told an audience during a recent talk at the University of Notre Dame.

She recounted asking the AI to help write a mystery story in which shoes go missing in an apartment complex. In the back of her mind, DiCerbo said, she planned to make a dog the shoe thief, but didn't reveal that to ChatGPT. "I started writing it, and it did the reveal," she recalled. "It knew that I was thinking it was going to be a dog that did this, from just the little clues I was planting along the way."

More tellingly, it seemed to do something Watson never could: have engaging conversations with students.

DiCerbo recounted talking to a high school student they were working with who told them about an interaction she'd had with ChatGPT around The Great Gatsby. She asked it about F. Scott Fitzgerald's famous green light, which scholars have long interpreted as symbolizing Jay Gatsby's out-of-reach hopes and dreams.

"It comes back to her and asks, 'Do you have hopes and dreams just out of reach?'" DiCerbo recalled. "It had this whole conversation" with the student.

The pair soon tore up their 2023 plans for Khan Academy. 

It was a stunning turn of events for DiCerbo, a Ph.D. educational psychologist and former senior Pearson research scientist who had spent more than a year on the failed Watson project. In 2016, Pearson promised that Watson would soon be able to chat with college students in real time to guide them in their studies. But it was DiCerbo's teammates, about 20 colleagues, who had to actually train the supercomputer on thousands of student-generated answers to questions from textbooks — and tempt instructors to rate those answers.

Like Nitta, DiCerbo recalled that at first things went well. They found a natural science textbook with a large user base and set Watson to work. "You would ask it a couple of questions and it would seem like it was doing what we wanted to," answering student questions via text.

But invariably if a student's question strayed from what the computer expected, she said, "it wouldn't know how to answer that. It had no ability to freeform-answer questions, or it would do so in ways that didn't make any sense."

After more than a year of labor, she realized, "I had never seen the 'OK, this is going to work' version" of the hoped-for tutor. "I was always at the 'OK, I hope the next version's better.'"

But when she got a taste of ChatGPT, DiCerbo immediately saw that, even in beta form, the new bot was different. Using software that quickly predicted the most likely next word in any conversation, ChatGPT was able to engage with its human counterpart in what seemed like a personal way.
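The "most likely next word" mechanic described here can be illustrated with a toy bigram model: count which word follows which in some text, then greedily emit the most frequent follower. This is only an illustration; a real LLM uses a neural network over subword tokens, not word counts, and samples rather than always taking the single top choice.

```python
from collections import Counter, defaultdict

def train_bigrams(text):
    """Count, for each word, how often each possible next word follows it."""
    words = text.lower().split()
    follows = defaultdict(Counter)
    for a, b in zip(words, words[1:]):
        follows[a][b] += 1
    return follows

def most_likely_next(follows, word):
    """Greedy decoding: return the single most frequent follower, or None."""
    if word not in follows:
        return None
    return follows[word].most_common(1)[0][0]

# Tiny invented corpus for demonstration.
corpus = "the tutor asks a question and the tutor waits for the student"
model = train_bigrams(corpus)
print(most_likely_next(model, "the"))  # prints "tutor" ("tutor" follows "the" twice, "student" once)
```

Scaled up by many orders of magnitude, and with far richer context than a single preceding word, this next-token prediction is what lets a chatbot produce replies that feel personalized.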

Since its debut in March 2023, Khanmigo has turned heads with what many users say is a helpful, easy-to-use, natural language interface, though a few users have pointed out that it sometimes .

Surprisingly, DiCerbo doesn't consider the popular chatbot a full-time tutor. As sophisticated as AI might now be in motivating students to, for instance, try again when they make a mistake, "It's not a human," she said. "It's also not their friend."

(AI's) not a human. It's also not their friend.

Kristen DiCerbo, Khan Academy

Khan Academy's research shows their tool is effective with as little as 30 minutes of practice and feedback per week. But even as many startups promise the equivalent of a one-on-one human tutor, DiCerbo cautions that 30 minutes is not going to produce miracles. Khanmigo, she said, "is not a solution that's going to replace a human in your life. It's a tool in your toolbox that can help you get unstuck."

'A couple of million years of human evolution'

For his part, Nitta says that for all the progress in AI, he's not persuaded that we're any closer to a real-live tutor that would offer long-term help to most students. If anything, Khanmigo and probabilistic tools like it may prove to be effective "homework helpers." But that's where he draws the line.

"I have no problem calling it that, but don't call it a tutor," he said. "You're trying to endow it with human-like capabilities when there are none."

Unlike humans, who will typically do their best to respond genuinely to a question, the way AI bots work — by digesting pre-existing texts and other information to come up with responses that seem human — is akin to a "statistical illusion," writes Harvard Business School Professor . "They've just been well-trained by humans to respond to humans."

Researcher Sidney Pressey's 1928 Testing Machine, one of a series of so-called "teaching machines" that he and others believed would advance education through automation.

Largely because of this, Nitta said, there's little evidence that a chatbot will continuously engage people as a good human tutor would.

What would change his mind? Several years of research by an independent third party showing that tools like Khanmigo actually make a difference on a large scale — something that doesn't exist yet.

DiCerbo also maintains her hard-won skepticism. She knows all about the halting early decades of machine-assisted teaching a century ago, when experimental, punch-card-operated "teaching machines" guided students through rudimentary multiple-choice lessons, often with simple rewards at the end.

In her talks, DiCerbo urges caution about AI revolutionizing education. As much as anyone, she is aware of the expensive failures that have come before. 

Two women stand beside open drawers of computer punch card filing cabinets. (American Stock/Getty Images)

In her recent talk at Notre Dame, she did her best to manage expectations of the new AI, which seems so limitless. In one-to-one teaching, she said, there's an element of humanity "that we have not been able to — and probably should not try — to replicate in artificial intelligence." In that respect, she's in agreement with Nitta: Human relationships are key to learning. In the talk, she noted that students who have a person in school who cares about their learning have higher graduation rates.

But still.

ChatGPT now has 100 million weekly users, according to . That record-fast uptake makes her think "there's something interesting and sticky about this for people that we haven't seen in other places."

Being able to engineer prompts in plain English opens the door for more people, not just engineers, to create tools quickly and iterate on what works, she said. That democratization could mean the difference between another failed undertaking and agile tools that actually deliver at least a version of Watson's promise.

An early prototype of IBM's Watson supercomputer in Yorktown Heights, New York. In 2011, the system was the size of a master bedroom. (Wikimedia Commons)

Seven years after he left IBM to start his new endeavor, Nitta is philosophical about the effort. He takes virtually full responsibility for the failure of the Watson moonshot. In retrospect, even his 25-year timeline for success may have been naive.

"What I didn't appreciate is, I actually was stepping into a couple of million years of human evolution," he said. "That's the thing I didn't appreciate at the time, which I do in the fullness of time: Mistakes happen at various levels, but this was an important one."

Exclusive: For Busy Teachers, AI Could Crack Open the Dense World of Ed Research
/article/exclusive-phonics-learning-styles-teachers-confounded-by-education-research-may-soon-turn-to-new-ai-chatbots-for-help/ Wed, 06 Sep 2023 11:15:00 +0000

As students across the U.S. enter their first full school year with access to powerful AI tools like ChatGPT and Bard, many educators remain skeptical of their usefulness — and preoccupied with their potential to .

But this fall, a few educators are quietly charting a different course they believe could change everything: At least two groups are pushing to create new AI chatbots that would offer teachers unlimited access to sometimes confusing and often paywalled peer-reviewed research on the topics that most bedevil them. 

Their aspiration is to offer new tools that are more focused and helpful than wide-ranging ones like ChatGPT, which tends to stumble over research questions with competing findings. And like many kids faced with questions they can’t answer, it has a frustrating tendency to make things up.




Tapping into curated research bases and filtering out lousy results would also make the bots more reliable: If all goes according to plan, they'd cite their sources.
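The pattern both groups describe, answering only from a vetted corpus and attaching the sources used, can be sketched roughly as follows. The journal names and the keyword-overlap retrieval are invented for illustration; a production system would pair a real retriever with a language model, but the key design choices survive even in this toy version: restrict answers to an approved corpus, cite every passage, and admit ignorance rather than guess.

```python
# Stand-in vetted corpus: (source, text) pairs; both entries are invented examples.
VETTED = [
    ("Review of Ed Research, 2019", "systematic phonics instruction improves early reading"),
    ("Learning & Instruction, 2021", "matching teaching to learning styles shows no benefit"),
]

def answer_with_citations(question, corpus=VETTED, min_overlap=2):
    """Return vetted passages sharing enough keywords with the question,
    each tagged with its source; decline to answer when nothing matches."""
    q = set(question.lower().replace("?", "").split())
    hits = [(src, text) for src, text in corpus
            if len(q & set(text.lower().split())) >= min_overlap]
    if not hits:
        return "Not enough vetted research to answer reliably."
    return "\n".join(f"{text} [{src}]" for src, text in hits)

print(answer_with_citations("What does research say about phonics instruction?"))
```

Off-corpus questions hit the "Not enough vetted research" branch, which is exactly the behavior the ISTE developers promise and the Learning Agency's prototype still lacks.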

The result, supporters say, could revolutionize education. If their work takes hold, millions of teachers for the first time could routinely access high-quality research and make it part of their everyday workflow. Such tools could also help stamp out adherence to stubborn but ill-supported fads in areas from "learning styles" to reading instruction.

So far, the two groups are each feeling their way around the vast undertaking, with slightly different approaches.

In June, the International Society for Technology in Education introduced , a tool built on content vetted by ISTE and the Association for Supervision and Curriculum Development. (The two groups merged in 2022.) ISTE has made it available in beta to selected users. All of the chatbot's content is educator-focused, and it's trained solely on materials developed or approved by the two organizations.

Richard Culatta

Now its creators say that within about six months, they expect that the tool will also be able to scour outside, peer-reviewed education research and return "pretty understandable, pretty meaningful results" from vetted journals, said Richard Culatta, ISTE's CEO.

"There's this big gap between what we know in the research and what happens in practice," he said. One reason: Most research is published in a format that "is just totally inaccessible to teachers."

Case in point: A set of by the Jefferson Education Exchange, a nonprofit supported by the University of Virginia's Curry School of Education, found that while educators prefer research they can act on — and that's presented in a way that applies to their work — only about 16% of teachers actually use research to inform instruction.

So he and others are building a digital tool, "purpose-built for educators by educators," that can translate research into practice, using "very practical language that teachers understand."

For instance, a teacher could ask the chatbot, "What does the research say about creating a healthy school culture?" or "What's the evidence for teaching phonics to developing readers?" One could also ask it to suggest activities that are appropriate for middle school students learning about digital citizenship.

Joseph South, ISTE's chief learning officer, said teachers want the latest research, but are up against formidable obstacles. "They have to find the article in the journal that happens to relate to the thing that they want to do," he said. "They have to somehow understand academic-speak. They have to have the time to read this, and they have to translate it into something useful."

While ChatGPT can comb through journals it has access to, translate and summarize the research, he said, it's not reliable. The typical chatbot — and thus the typical end user — doesn't know whether the results are from a credible, peer-reviewed journal or not, and it may not necessarily care.

Joseph South

"We do, though," he said. "So we can do that filtering and let the AI do its magic."

As with its beta version, the new chatbot will also cite the sources used to generate each response. And it'll let users know when it simply doesn't have enough information to return a reliable response.

Developers are still in the early stages of deciding what academic journals to include. For now, they're experimenting with a handful of key research articles, but will expand the chatbot's range if initial prototypes prove helpful to educators.

Culatta and South, both veterans of the U.S. Department of Education, have spent years working on the research-to-practice problem, offering, in effect, translation services for research findings. "We've spent so much work trying to figure out how to do it and it's just never really worked," he said. "It's just always been a struggle. And we actually think that this could be the first for-real, sustainable, scalable approach to taking research and getting it into language that actually could be used by teachers."

Daniel Willingham

Daniel Willingham, a professor of psychology at the University of Virginia and a well-known translator of education research, said his limited experience with ChatGPT has shown that when asked about a subject where there's general consensus, such as "What is the effect of sleep on memory?" it produces helpful results. But it isn't very good at synthesizing conflicting findings.

It's also inconsistent in its willingness to reveal, in Willingham's words, that "'I really don't know anything about that.' And so it, you know, just ."

A paid ChatGPT subscriber, Willingham said he gets "really useful" results only about 20% of the time. "But it requires plenty of verification on my part. And this is all within my area of expertise, so it's not very hard for me to verify."

Tapping 'What Works'

ISTE isn't the only organization pushing to make education research more widely accessible via chatbot. The Learning Agency, a Washington, D.C.-based consulting firm, is also testing a prototype of a bot designed to offer answers to education research queries.

Unlike ISTE's, the agency's tool taps an already existing, if finite, resource: the U.S. Department of Education's What Works Clearinghouse, or more specifically its Doing What Works Library, a curated collection of materials developed by the department's Institute of Education Sciences.

"We were inspired to basically create a special version of ChatGPT that was exposed to more high-quality educational data and research evidence on what works," said Perpetual Baffour, the group's research director.

In a sense, she said, much of the work had already been done, since the library, though limited, exists to translate research findings into more digestible forms for educators. The result is a prototype that offers what Baffour calls "evidence-based education advice" on topics from adolescent literacy to dropout prevention and school turnarounds.

Perpetual Baffour

Baffour noted that the app development tool allowed the agency to create a "simple but robust" prototype within a day.

At the moment, the version they're testing is "very basic," Baffour said. The agency is still considering what it'll look like in the future.

"It only knows as much as the Doing What Works Library tells it," she said. So queries about topics that are in the library return rich results. But asking it about topics that aren't can be problematic.

For example, ask it about myths around the aforementioned learning styles and it defaults to a more general knowledge base scraped from Wikipedia articles, transcripts of recorded conversations and materials from "different corners of the Internet," Baffour said. "And as you can imagine, those sources might not have the most up-to-date and accurate information about education."

Indeed, a query about the topic returns this: "The myth concerning learning styles is that there is one specific style that works best for everyone. This is not true, as different people learn in different ways and have different preferences for how they learn. Additionally, there is no evidence that suggests that focusing on one particular learning style is more effective than focusing on multiple styles."

Not exactly accurate or helpful.

In the first place, the widely believed "myth" holds that people with different learning styles learn best when their preferred mode of learning is indulged — not that one style works best for everyone. At a more basic level, while many people may express preferences for ways to take in new information and study — receiving instruction verbally, for example, instead of via pictures — scientists have yet to find good evidence that material tuned to these preferences .

Unfortunately, at the moment the agency's bot doesn't confess whether it knows a lot or a little about a topic. Baffour said they want to change that soon. For now, however, that's just an aspiration.

"I think you're more likely to get a confident chatbot producing inaccurate information than you are to get a self-aware chatbot admitting its false and incomplete knowledge," she said.

Willingham, the UVA researcher, said a useful education-focused chatbot would not just have to incorporate reliable findings, but put them in context. For example, an answer to a query about the evidence for phonics instruction would properly note that, while the record is fairly strong, a lot of mediocre research and "hyperbolic claims" made in support of alternative methods serve to cloud the overall picture — a delicate but accurate detail.

"How is an aggregator going to negotiate that?" he said.

Asked if he thought a chatbot might soon replace him, Willingham, the author of and a that translate learning science into plain English, said he wouldn't make any predictions.

"I was never much of a futurist, but I hocked my crystal ball 15 years ago," he said.
