Shared Chat
What is RLHB?

RLHB stands for "Reinforcement Learning from Human Behavior". It is a concept where models and algorithms are trained on data from human behavior in order to analyze user behavior and make predictions. Learning from observed behavior makes it possible to recognize patterns and anticipate what users will do next.

One application of RLHB is product analytics, where models analyze user behavior data and make predictions about user actions and outcomes. By understanding user behavior patterns, companies can optimize their products and make data-driven decisions.

First-party behavior data is particularly valuable in RLHB because it provides insights into user behavior that cannot be found anywhere else. This type of data helps build accurate models and improve predictions.

One challenge in RLHB is interpreting data, especially queries expressed in natural language. Natural language interfaces are a great way to interact with data, but natural language's lack of precision can be a problem: the semantics of a query must be captured accurately to obtain reliable insights.

Overall, RLHB holds potential for improving predictive models, understanding user behavior, and making data-driven decisions. However, further advancements and research are needed to fully leverage the power of learning from human behavior.

1
(someone): You know, how many times do you get them to come back and search in the future? Like, that's much more interesting than the length of your session. And so because, you know, each time you can serve ads, right, that's your business. And so it's like, if you choose a metric that's well correlated with your business outcomes, then that's at least the first step to getting that right and not getting caught up in other vanity metrics that sound like they could be good to increase, but can sometimes lead to negative business outcomes. And then you get the worst case: you've optimized the wrong metric the whole time.
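The advice above, pick a metric that correlates with business outcomes rather than a vanity metric, can be sanity-checked numerically. A minimal sketch using a Pearson correlation, with all weekly figures invented purely for illustration:

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical weekly numbers: a candidate metric vs. actual revenue.
returning_searchers = [120, 135, 150, 160, 180, 210]
revenue             = [1000, 1100, 1250, 1300, 1500, 1750]
session_minutes     = [40, 38, 41, 39, 37, 40]  # a "vanity" metric

print(pearson(returning_searchers, revenue))  # strongly positive
print(pearson(session_minutes, revenue))      # near zero
```

A metric whose correlation with revenue is near zero is exactly the kind of number that "sounds good to increase" while doing nothing for the business.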
(someone): And that's where tying in AI and product analytics makes a lot of sense. And it's really important because product analytics, these companies that are our customers that are trying out building features on LLMs and they're not sure what to optimize for: optimize for the same thing you're already optimizing for. You're already measuring conversions, you're measuring how much value, hopefully, your customers are getting out of your product. So continue doing that and maybe find a way to tie the LLM feature to that, through A-B tests and that sort of thing. And then on chat specifically, for a business maybe rolling out a chatbot based on LLMs, it can be really scary. And that's another sort of mental model of framing we've been thinking around is,
2
(someone): Yeah, I'll do the one that we actually repeat in inside amplitude very often, not about AI, but I think it applies, which is it's early. Um, it's sometimes hard to realize that when things are happening so fast, especially in the Bay area, but like the ramifications of AI or in our, in our case, product data and all that are going to play out over the next many decades. And that's just, you know, we're, we're very. You know, fortunate to be at the beginning of it. Um, and so yeah, take advantage of it and keep reminding yourself that, that, that it's early.
(someone): I guess mine would be let humans be good at doing human things, let machines be good at doing machine things, and let machines help humans be good at doing human things. And like, if you don't do that, then you're going to be building something that's either not useful or it's very scary. So, uh, yeah, getting machines helping humans, not the other way around.
swyx: Good machines, helping humans. All right. Uh, with that, I think we're all going to open up to questions. Yeah, go ahead. We're going to toss you the mic.
(someone): Hey, um, thanks for the insight into how you guys implemented your AI question-asking chatbot, and how you converted it into seven sub-queries and then generated the data out. Yeah.
3
Alessio Fanelli: What isn't? Any thoughts there?
(someone): Yeah, I think, I think it's safe to say that both are really important, right? Like the evolution of, of LLMs really was a lot of, a lot of model innovation. Um, and so I don't want to downplay that at the same time. I think the future of AI applications and doing really cool things with it will be in the data. Um, partially because like, you know, ChatGPT has done such a huge advance, right? The LLM model space has advanced like crazy in the last year. And so I think a lot of the untapped potential will be in data in the future. One thing that's particularly interesting to us is we have a pretty unique data set, actually. It's a lot of first-party behavior data. So if you're Square, for example, you instrumented the way that people interact with Square Cash and the wallet and the checkout system. And those are very specific things. Square can't look elsewhere in the world for that stuff. And that's really interesting because you know, to build models of user behavior, you need user behavior data. And it turns out there's not actually a lot of examples of user behavior data out there in the world. And so to Joy's point earlier about, you know, we think we have one of the best user behavior data sets in the world.
4
(someone): What was different about them? And so we have a bunch of heuristics to do that. But at the end, there's something like, you know, causal impact is like one of the holy grails of product analytics. It's like, what was the causation behind some observed difference in behavior? And I think, yeah, a large behavioral model will be much better at assessing that and be able to, you know, give you potentially interpretable ways of answering that question that are really hard to do today, really computationally intensive, really noisy. Distilling causation from correlation is obviously super hard. Those are some of the examples. The other one that I am, I don't know if I'm optimistic about it, but would be really interesting: one of the things that Amplitude requires today is manual instrumentation. You have to decide, hey, this clicking of a button, this viewing of the page, these are important things, I'm naming them in this way. There's a lot of popular tools out there that just record user sessions or track DOM events automatically. You know, there's a lot of problems with those tools because the data is incredibly noisy. It's just so noisy, right? A lot of times you just can't actually interpret it. And so it's like, oh, it's great.
5
(someone): It's all language stuff. But it's a little bit slow and a little bit expensive to do that every time. Once we validated that that works, we fell back to a more traditional embedding-based approach. It's like, all right, compute all those embeddings. That's more work up front, because you have to go through your database of all of these things and you've got to commit that engineering work, but you validate with the general model because it's just easy. It takes an hour to figure out that it works. And then it's like, all right, can we do that same thing with embeddings that's way faster, way cheaper, and still has reasonable quality? Embeddings also have a nice quality that you can get the magnitude of things, whereas LLMs aren't great at giving you, like, hey, it matches this much. You can ask it for an ordering and that's decent, but anything beyond that is pretty challenging.
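The embedding fallback described here relies on the fact that embedding similarity gives a graded score you can rank by, not just a yes/no judgment. A toy sketch with hand-made vectors (a real system would precompute them once with an embedding model; all names and numbers below are invented for illustration):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity: unlike an LLM yes/no answer, this returns a
    graded score, so candidate matches can be ranked by magnitude."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

# Toy precomputed "embeddings" for items in a catalog.
catalog = {
    "purchase_completed": [0.9, 0.1, 0.2],
    "page_viewed":        [0.1, 0.8, 0.3],
    "button_clicked":     [0.2, 0.7, 0.4],
}
query = [0.85, 0.15, 0.25]  # embedding of the user's phrase, e.g. "checkout"

# Rank catalog entries by similarity to the query; the top hit wins.
ranked = sorted(catalog, key=lambda k: cosine_similarity(query, catalog[k]),
                reverse=True)
print(ranked[0])  # → purchase_completed
```

This captures the trade-off in the transcript: the embeddings cost an up-front batch computation, but each lookup after that is cheap and comes with a usable "how well does it match" number.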
Alessio Fanelli: How do you think about the importance of the model versus the data, right? There's like a lot of companies that have a lot of data, but not a lot of AI expertise or companies that are just using off the shelf model. How should companies think about how much data to collect? What data is meaningful? What isn't? Any thoughts there?
6
(someone): A lot of times you just can't actually interpret it. And so it's like, oh, it's great, cause not only do you not do any work, but, well, you also don't get anything out of it. Um, it's possible that a behavioral model would be able to actually understand what's going on there by, you know, understanding your user behavior in a correctly modeled and correctly labeled sense and then figuring it out. I don't know if that's possible. I think that would make everyone's lives a lot easier. If you could somehow ask behavioral questions of data without having to instrument, you know, all of our customers would love that. Um, but also all of them are instrumenting because they know that's definitely not possible today.
(someone): This is really interesting. Looking forward to the future. If you're going to build it, it's going to be amazing. Yeah.
(someone): That's the goal. That's the goal. Awesome.
swyx: And thanks for listening. Bye.
7
(someone): Maybe creativity is the best word, where it's, you know, with the image generation stuff, text generation, you know, the one thing that still blows my mind, I used to be a competitive, like, math guy. And like, there's this International Math Olympiad problem in one of the papers, and it solves it. And I'm just like, wow, I could solve this when I was spending all my life doing this thing. Like, that level of creativity really blew my mind. And what's the takeaway? It's like, maybe the takeaway is that creativity is not as high entropy or high dimensional as we think it is, which is kind of an interesting takeaway. But yeah, that one definitely surprised me.
(someone): I guess there's something actually that maybe answers the inverse question, something a lot of my friends were surprised happened quickly, and I was like, this is just blindingly obvious. I've got a lot of friends in the AI safety space. So they're worried in particular about x-risk, right, existential risk, that AI is going to kill the human race. And they were like, oh no, what if an AI escapes containment and gets access to the internet? And then we get an LLM and the first thing we do is like, hey, AutoGPT, here's the internet.
swyx: You're saying it's happening faster than you thought.
8
(someone): On the one end, it's like literally track nothing, end of story. And for people like that, I mean, that's cool. They're not going to use Amplitude. They may not like us very much. That is what it is. And then on the other end of the spectrum is like, we're going to track you across the entire internet and sell your data to everyone. And that's obviously bad, and there's lots of good reasons to think that's bad. First-party behavioral data, I think, is actually probably almost as far as you can get. Yeah, okay. Fully anonymized first-party behavior data would be, like, kind of the minimum. It's like web server logs with no IP, no identifier, nothing. And the problem is that you can't do a lot of interesting behavioral analysis without that. You can't tell if, you know, this person that came on this day was the same one that purchased later. And so it's much harder to make your product better if you don't have that. And so, you know, we're kind of set at this place where we have, you know, pseudo-anonymized first-party data. And we don't, we don't sell the data.
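One common way to get the "pseudo-anonymized first-party data" described here is a keyed hash of the raw identifier: stable enough to link the same visitor across sessions, but not reversible without the key. A sketch under assumed conventions (the salt handling and field values are hypothetical, not Amplitude's actual scheme):

```python
import hashlib
import hmac

# Hypothetical secret; in practice this would live in a secrets store.
SECRET_SALT = b"rotate-me-periodically"

def pseudonymize(user_id: str) -> str:
    """Keyed hash of a raw identifier: deterministic, so the same visitor
    maps to the same pseudonym, but not reversible without the salt."""
    return hmac.new(SECRET_SALT, user_id.encode(), hashlib.sha256).hexdigest()[:16]

# The same visitor on two different days gets the same pseudonym, so
# funnel questions (visited on day 1 -> purchased on day 3) still work
# without ever storing the raw identifier.
day1  = pseudonymize("alice@example.com")
day2  = pseudonymize("alice@example.com")
other = pseudonymize("bob@example.com")
print(day1 == day2, day1 == other)  # True False
```

This is exactly the middle ground the speaker lands on: fully anonymized logs would break the "was this the same person who purchased later" question, while a stable pseudonym preserves it without selling or exposing identities.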
9
(someone): Because the precise semantics are not captured by your ambiguous natural language words. And so the way we think about it, at least today, you know, who knows what's going to change in the future, is: natural language is a great interface to get started. If you don't know what the underlying data looks like, if you don't know what questions you should be asking, it is a very, very expressive way to get started. It's much easier than manipulating a bunch of things, much, much easier than writing SQL and all of that. But once you kind of know what you want, it's very hard to make it precise. It's actually easier to make SQL or code precise than it is natural language. And so that's a little bit of what we're thinking right now. We think, you know, for sure, the way that maybe many people interface with analytics and data will turn into natural language, because maybe the precision doesn't matter to them. But at the end of the day, when you're trying to sum up your revenue or something, you want to know that it's right. You want to know the semantics that go into that. That's part of why data is hard. The semantics really do matter. They can make a huge difference in the output. There's a boundary there that I'm curious where it will push over time, but I don't think it's quite there yet.
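The precision gap described here is easy to make concrete: the same natural-language question, "sum up your revenue", admits several formalizations with different answers. A toy illustration with invented order data:

```python
# Two plausible readings of the natural-language question
# "what was our revenue?" over the same (invented) order data.
orders = [
    {"amount": 100, "refunded": False},
    {"amount": 250, "refunded": False},
    {"amount": 80,  "refunded": True},
]

# Reading 1: all orders count.
gross_revenue = sum(o["amount"] for o in orders)

# Reading 2: refunded orders don't count.
net_revenue = sum(o["amount"] for o in orders if not o["refunded"])

print(gross_revenue, net_revenue)  # 430 350
```

A natural-language interface has to pick one of these readings silently; SQL or code forces the choice into the open, which is why it is easier to make precise.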
10
(someone): So that cohort might be, you know, people who will churn, people who will purchase, people who will upsell, you know, whatever the customer wants. Um, we think it would be much better at tasks like that, because if it just has a very good understanding of behavioral patterns and what's going to come next, it would be able to do that. That's exciting, but not that exciting. If I'm trying to think about the analogies to what we see in LLMs, it's like, okay, what is the behavioral equivalent of learning physics concepts? I don't actually know, but it might be this understanding of, you know, patterns of sessions. Like, for example, categorizing users in an unsupervised way seems like a very simple output for a model that understands user behavior, right? Here's all the users. And, you know, if you want to discriminate them by their ability to achieve some outcome in the future, here's the best way to separate that group. And here's why, right? Be able to explain at that level. And like, that would be super powerful for customers, right? A lot of times what our customers do is, hey, these people came back the next day and these people didn't. Why? What was different about them? And so we have a bunch of heuristics to do that.
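The "categorizing users in an unsupervised way" idea can be sketched with the simplest possible clustering, a one-dimensional k-means over a single behavioral feature. The feature and numbers below are invented purely for illustration, and a real behavioral model would of course work over far richer session data:

```python
def kmeans_1d(values, k=2, iters=20):
    """Minimal 1-D k-means: split users into k behavioral groups."""
    centers = [min(values), max(values)]  # simple init, assumes k=2
    groups = [[] for _ in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in values:
            # Assign each value to its nearest center.
            idx = min(range(k), key=lambda i: abs(v - centers[i]))
            groups[idx].append(v)
        # Recompute centers as group means (keep old center if empty).
        centers = [sum(g) / len(g) if g else centers[i]
                   for i, g in enumerate(groups)]
    return centers, groups

# Hypothetical feature: sessions per user in their first week.
sessions = [1, 2, 1, 0, 2, 14, 11, 16, 13, 1]
centers, groups = kmeans_1d(sessions)
# Two clear cohorts emerge: casual users (~1 session/week)
# and power users (~13 sessions/week).
print(sorted(centers))
```

The transcript's point is that a behavioral model could go one step further than this kind of mechanical split: not just separating the groups, but explaining why they differ.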
14
(someone): So how do we monitor this? We can't just use one performance metric. You know, like with RLHF, you can't just say yes or no, was the query response good? Because it might have failed for one of seven reasons, and maybe multiple of them failed. And then maybe they've hallucinated, and so we're getting code errors where an enum is not being matched. So we've had lots of issues going all the way down there. We've had to figure it out from first principles, and it's been a really exciting way for us to understand what our customers are going through. Yeah.
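The failure-tracking problem described here, a query that can break for one of several reasons including a hallucinated enum value, suggests validating each sub-query's structured output and recording a list of failure reasons instead of a single pass/fail bit. A hypothetical sketch (the enum values and field names are invented, not Amplitude's actual schema):

```python
from enum import Enum

class ChartType(Enum):
    LINE = "line"
    BAR = "bar"
    FUNNEL = "funnel"

def validate_subquery(raw: dict) -> list:
    """Check one sub-query's structured output and return every failure
    reason found, so monitoring can report *which* step went wrong
    rather than a single yes/no signal."""
    errors = []
    try:
        # A hallucinated value that isn't in the enum raises ValueError.
        ChartType(raw.get("chart_type"))
    except ValueError:
        errors.append("unmatched enum value: %r" % raw.get("chart_type"))
    if not raw.get("metric"):
        errors.append("missing metric")
    return errors

good = {"chart_type": "line", "metric": "signups"}
bad  = {"chart_type": "sparkline", "metric": "signups"}  # hallucinated value
print(validate_subquery(good))  # no failure reasons
print(validate_subquery(bad))   # one failure reason
```

Aggregating these per-reason error lists across the seven sub-queries gives exactly the kind of multi-dimensional monitoring the speaker says a single metric can't provide.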
swyx: You've described your exploration and how you think about products. What have you released so far? I just want to get an idea of what has been shipped.
(someone): Sure, so in terms of LLM stuff, we call it question-to-chart internally: this ask-a-question, get-a-chart-out feature. We've started rolling it out to customers already. So last week, actually, we started rolling it out to the AI design partners that we had signed up, which is a really exciting process. Actually, a lot of customers are just so excited to work with us and try it out and see how they can break it. So that's something we rolled out recently, and it's the first piece built on LLMs that we've worked on.
16
(someone): I'll go first. I'll go first. I think the first thing that needs to happen here is the community will actually get the model into its hands and find out its true capabilities. Benchmarks only take us so far. Once that has happened, we're going to see an extensive period of fine-tuning where people are going to apply it to their particular applications and keep pushing the envelope here. And then if it is sufficiently capable, I actually think that we might find new uses for these models that we don't find in REST-API-served ones, because you can get at the internal state, right? The thing that I'm always thinking about, obviously, is embeddings and internal states and modifications there. And I think that there's actually a great deal of interesting research and engineering to be done by looking into what's happening in these models live, especially a sufficiently capable one with which we can do reasoning. And so I'm particularly excited about that. I'm particularly excited about having something at least sufficiently capable that we can start to reason about, because the entire research community has access to it rather than it being behind a closed wall inside some of the bigger AI labs.
(someone): on how remarkable the collapse of kind of NLP research as it was has been onto OpenAI APIs. And this is an opportunity to reset some of that dynamic, where so much academic work was just fine tuning OpenAI models. And I was like, Oh, sorry, we nuked all your fine tuned models and things like that.
17
swyx: Like there should be a ton of people buying ads specifically on Google Maps. So they just show up and I don't know how big that business is, but it's got to be huge. And then my subsequent thing is like there should be Google Maps optimization where you would name your business like best barbershop and it would show up as best barbershop when you look at it.
(someone): Yeah, of course. Right. Yeah. It's like AAA lock picks, right at the front of the yellow pages. Favorite AI people and communities you want to shout out? You know, I don't think that I have necessarily anything super original to say on this front. To the best of my understanding, this is an all-volunteer effort, and it's, you know, incredible what they have been able to accomplish. And it's kind of in the constellation of projects, you know, that I think you would say in response to this question. I think the Hugging Face group is kind of like Google Maps in a way, in the sense that you forget how complicated the thing that it's doing is. And there are specific people, I was thinking of Stas, who works on a lot of the DeepSpeed stuff, just super conscientious and engaged with the community. And the entire team at Hugging Face is incredible. They have made a lot of what is happening in the industry at large possible.
22
(someone): All of those kinds of things. My hunch is that it is going to turn out to be extremely good. I doubt that it'll turn out to be a damp squib on that front. But yeah, so they've released it. It's available commercially and you are allowed to redistribute it, but the only way to officially get the weights is to fill in a form on their website and wait for them to approve you, which is kind of stupid because obviously it's already started leaking. I downloaded a version onto my laptop this afternoon, which worked. There's a ggml version from TheBloke floating around on Hugging Face already. So, you know, within 24 to 48 hours, I think every possible version of this thing will be available to download without going through a waiting list. I'm almost not sure why they even bother with that, especially since, you know, Llama leaked within a few days last time, and somebody ended up submitting a pull request to the GitHub readme with a link to the BitTorrent for the Llama models, which Facebook didn't delete. You know, they kind of nodded and winked and said, yeah, this is what you can do. And now it's even legitimately okay to do it because the license says you can. But anyway, it's out there. You can run it on your computer right now, today. It's also hosted in a bunch of places.