a thought experiment about false positives

The future of GPT-likes, a prediction

** Text
Hi everyone, it's been months since my last newsletter, but there's a lot to cover as we're getting back to talking about what's been happening in the world of large language models (LLMs).

So, obviously, a lot of this is going to be about ChatGPT. That's unavoidable. It's the big flashy product that's growing to tens and tens of millions of users, over a hundred million if their hype is to be believed, and it's the main thing that everyone is talking about. It was supposed to change the world, but has it?

When it comes to education, the fear of ChatGPT has certainly already had a large impact. Now that we're reaching the end of the school year here in the US, we're seeing a lot of news stories crop up about "AI detection tools" being used to flunk swaths of students.

The biggest story was about Texas A&M-Commerce: https://www.nbcdfw.com/news/local/tamu-commerce-instructor-accuses-class-of-using-chatgpt-on-final-assignments/3260731/

But I've also been finding stories like this showing up a lot on Reddit: https://www.reddit.com/r/ChatGPT/comments/13qt26p/my_english_teacher_is_defending_gpt_zero_what_do/ (with regard to the comments, caveat lector)

So the Texas A&M story is just wild, right? The professor, who is apparently a 30 y/o who got his PhD just a couple of years ago, so this isn't a tenured boomer on his way out, is pasting answers into ChatGPT and asking "did you write this?", which is an absolutely nonsensical thing to do. ChatGPT has no record of what it generated in other people's sessions; its yes-or-no answer is just more generated text. As I'm fond of quoting Pauli: it's not even wrong.

But the stories involving tools like GPTZero are more interesting, because such tools are, at least in principle, capable of detecting whether text was generated by gpt3/3.5/4. So what does it mean when almost every student in a class is getting accused of plagiarism? Well, this is an example of what is sometimes called the "false positive paradox", which is itself an example of the "base rate fallacy": https://en.wikipedia.org/wiki/Base_rate_fallacy

The idea is pretty simple. Let's imagine, just to make the numbers look nice, that the detector correctly flags 95% of generated essays, and that when it reads an essay a human actually wrote, it wrongly flags it 5% of the time: a false positive. That sounds pretty good, right? But the problem is that we have to worry about how many cheaters there are in the first place! Imagine a class of two hundred students where fully half of them are cheating. Then you're going to get 100*0.95 = 95 cheaters correctly flagged and 100*0.05 = 5 non-cheaters incorrectly flagged. Well, duh, that's what we expect from the 95% accuracy, right?

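The arithmetic above can be sketched in a few lines (the 200/95%/5% numbers are just the made-up ones from the thought experiment, not measurements of any real detector):

```python
# Half-the-class-cheats scenario: 200 students, 100 cheaters,
# a detector that flags 95% of generated essays and wrongly
# flags 5% of human-written ones.
students = 200
cheaters = 100
sensitivity = 0.95          # fraction of cheaters correctly flagged
false_positive_rate = 0.05  # fraction of non-cheaters wrongly flagged

true_flags = cheaters * sensitivity                        # cheaters caught
false_flags = (students - cheaters) * false_positive_rate  # innocents accused

print(true_flags, false_flags)  # 95.0 5.0
```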
Now consider: what if, out of those two hundred, only 10 are cheaters? Then you're going to get, roughly, all ten of those cheaters identified, but also 190*0.05 = 9.5, so roughly another ten students who aren't cheaters flagged as well. Half of the accused are now innocent.

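Running the same made-up detector over the low-cheating version of the class shows how bad an individual accusation gets:

```python
# Same hypothetical detector, but now only 10 of 200 students cheated.
students = 200
cheaters = 10
sensitivity = 0.95
false_positive_rate = 0.05

true_flags = cheaters * sensitivity                        # 9.5
false_flags = (students - cheaters) * false_positive_rate  # 9.5

# Of everyone flagged, what fraction actually cheated?
precision = true_flags / (true_flags + false_flags)
print(precision)  # 0.5 -- a flagged student is literally a coin flip
```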
So the efficacy of a tool like GPTZero depends almost entirely on how prevalent you think cheating is in the first place. But even in the best case scenario, you will be punishing students who are not cheating!

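To see how strongly the verdict depends on that assumed prevalence, here's a quick sweep over hypothetical cheating rates for the same imaginary 95%/5% detector, printing the chance that a flagged essay was actually generated:

```python
# P(actually cheated | flagged), as a function of the base rate of cheating.
sensitivity = 0.95
false_positive_rate = 0.05

for cheat_rate in [0.5, 0.25, 0.10, 0.05, 0.01]:
    tp = cheat_rate * sensitivity              # true positives per student
    fp = (1 - cheat_rate) * false_positive_rate  # false positives per student
    precision = tp / (tp + fp)
    print(f"{cheat_rate:>4.0%} cheaters -> {precision:.0%} of flags are real")
# At a 1% cheating rate, only about 16% of flagged students actually cheated.
```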
Consider, then,