felix089 6 hours ago

How did you structure the dataset for FT? Reminds me of: https://rosslazer.com/posts/fine-tuning/

  • jonpizza an hour ago

    I chunked my conversations by day so that each conversation in the dataset would stay on roughly the same topic without random switching. It isn't perfect: ideally I would let an LLM chunk the conversations into logical start/stop points, but I didn't want to spend all that money on tokens. I also got rid of any conversations with images, as well as group chats, to simplify things.
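
    For anyone wanting to try the same thing, here is a minimal sketch of by-day chunking with the image and group-chat filters. It is not the actual preprocessing script: the message fields (`chat_id`, `timestamp`, `sender`, `text`, `has_image`, `is_group`) and the function name `chunk_by_day` are hypothetical and depend on how your chat export is parsed.

```python
from collections import defaultdict
from datetime import datetime


def chunk_by_day(messages):
    """Group messages into one conversation per (chat, calendar day).

    Each message is assumed (hypothetically) to be a dict with the keys
    chat_id, timestamp (datetime), sender, text, has_image, is_group.
    """
    chunks = defaultdict(list)

    # Sort by time so every chunk keeps its messages in chronological order.
    for msg in sorted(messages, key=lambda m: m["timestamp"]):
        if msg["is_group"]:
            continue  # drop group chats entirely
        key = (msg["chat_id"], msg["timestamp"].date())
        chunks[key].append(msg)

    # Drop any day-long conversation that contains an image, and keep only
    # the fields needed to build training examples.
    return [
        [{"sender": m["sender"], "text": m["text"]} for m in msgs]
        for msgs in chunks.values()
        if not any(m["has_image"] for m in msgs)
    ]
```

    Each returned list is one conversation that can then be mapped to whatever chat format the fine-tuning tool expects. Splitting on calendar days is a cheap heuristic; a conversation that runs past midnight gets cut in two.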

ungreased0675 11 hours ago

Have you tried talking to yourself with this? Were there any unexpected insights?

  • jonpizza an hour ago

    I asked "myself" what my greatest fear was and it actually gave me an accurate answer. Then I asked it again and it said "Clowns". I don't think it was particularly insightful, but it was slightly eerie. The tone and style are kinda spot on tbh, though the content is generally incorrect.

bn-l 12 hours ago

I tried it out. Were many of the texts about hooking up and “cuties”?

  • jonpizza 11 hours ago

    Not really. There are some, but I do think my conversations kinda biased the model towards adjacent topics. I also included something along those lines in the system message ("Hi, I have a cute girl I want you to meet") as a sample response to a basic input like "Hi", just to make the conversation more interesting.