Automated Listing Descriptions with


The Story Behind

From 2016 to 2019, when I led the data science team at Enodo, we aggregated data from millions of apartment listings, pulled out unit attributes like beds, baths, square footage and amenities, and trained machine learning algorithms on that data to predict the rent increase from value-add amenities. If you were a value-add investor or developer trying to decide the ideal renovations or features for your next project, we could tell you how much you’d get for those granite countertops, the hardwood floors – even that fancy dog washing station.

At the time, it seemed like everyone in commercial real estate was talking about the “amenity wars”. Developers were incorporating increasingly extravagant amenities such as dog runs, roof decks, and infinity pools to win over tenants in an environment where many new developments were hitting the market simultaneously. Clearly there was an opportunity to help real estate investors optimize the scope of their investments.

As you can imagine, to develop precise algorithms, we had to get really, really good at extracting accurate data from listings and combining it with demographic, economic and locational amenity data to train on. I swear I must have read at least 20,000 listings as we were analyzing the various amenities that could exist in a multifamily property and how brokers referred to them. Data science isn’t always fun, but it was exhilarating when we finally perfected our algorithms, scaled our customer base, and sold the company in 2019.

One thing really stood out as I was reading all those listings… they were terribly written. So many instances of “Location Location Location!”, or all caps descriptions like “THIS IS A MUST SEE PROPERTY IN THE HEART OF…”, as well as misspellings, missing or incorrect information, and frustratingly bad grammar. Sure, most people focus on the pictures and amenities – but many of the amenities were buried in those poorly worded descriptions. I wondered why brokers didn’t spend more time crafting professional listings that would showcase the property well and improve SEO.

The Eureka Moment 5 Years in the Making

It occurred to me at the time that it might be possible to do the opposite of what we were doing. Instead of extracting data from poorly written listings, what if we could use the data to generate the perfect listing description? It was a compelling idea, but we were selling to real estate investors, not listing agents. Still, I had it in the back of my mind that solving this problem might be a way to put all my painstaking experience analyzing listings to good use.

When OpenAI released GPT-3, I saw an opportunity to finally tackle the listing generation problem. I started by talking with homebuyers and renters, asking what they like and hate about listings. This is probably not news to anyone, but people hate the way brokers write listings today. They hate the all caps, the exaggeration and puffery, the missing information, and the bad grammar. They prefer descriptions that use complete sentences and describe the property in a walkthrough format, where you can visualize walking through the home and seeing the layout and improvements. So clearly there was room for improvement.

How Long Does it Take Today?

Then I talked with multifamily brokers. For apartment listings, they described it as a ton of copying and pasting. The listings only take around 5-10 minutes to write because they’re largely copied from previous listings. That makes sense. As you can imagine though, this doesn’t provide an accurate description of the specific unit a potential renter is looking at, and they’re certainly not using market data to optimize the descriptions. It seemed like the listing agents largely “phoned it in” on the descriptions unless it was a high-end unit represented by a luxury brokerage. Multifamily leasing is highly competitive though, so I saw an opportunity in this market to algorithmically deliver descriptions that minimize time on market and maximize rent. The time savings would be smaller, but the competitive differentiation could be very impactful.

For single family listings, brokers said it often takes 25-30 minutes to craft a listing description, and that the more expensive the property, the longer it takes. That also makes sense. If you’re listing a $1,000,000 home, it would be tough to tell the owner you only spent 5 minutes writing the listing. I found that single family descriptions were generally longer and more specific, but suffered from the same grammar and formatting issues as multifamily listings. Here I saw an even bigger opportunity – help single family brokers save 30 minutes per listing while generating much better descriptions to improve SEO and minimize time on market.

A Massive Market

Last year, there were about 6.9 million home sales in the U.S. That means there were at least as many listing descriptions written (excluding properties that were listed but not sold). The Bureau of Labor Statistics estimates there are about 44 million renters in the U.S., and with annual apartment turnover at about 50%, that means there were at least 22 million listing descriptions written for multifamily properties. So by simple math (6.9M x 30 mins) + (22M x 5 mins) = 317 million minutes, or about 5.3 million hours spent writing listings every year. And they’re poorly written for the most part, so it’s really 5.3M hours spent writing listings that people don’t like. That seems like a problem worth solving.
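For anyone who wants to check the math, here is a minimal sketch of the estimate in Python. The inputs are the figures quoted above; the function and variable names are just illustrative, and the per-listing times are the rough numbers brokers gave me, not measured values.

```python
# Back-of-the-envelope estimate of time spent writing listings each year.
# All inputs are the approximate figures quoted in the article.
HOME_SALES = 6_900_000        # U.S. home sales last year
RENTERS = 44_000_000          # estimated U.S. renters
TURNOVER = 0.50               # annual apartment turnover rate
MINUTES_SINGLE_FAMILY = 30    # upper end of the 25-30 minute range
MINUTES_MULTIFAMILY = 5       # lower end of the 5-10 minute range


def annual_listing_minutes(home_sales, renters, turnover, sf_minutes, mf_minutes):
    """Return total minutes spent writing listing descriptions per year."""
    multifamily_listings = renters * turnover
    return home_sales * sf_minutes + multifamily_listings * mf_minutes


total_minutes = annual_listing_minutes(
    HOME_SALES, RENTERS, TURNOVER, MINUTES_SINGLE_FAMILY, MINUTES_MULTIFAMILY
)
total_hours = total_minutes / 60

print(f"{total_minutes / 1e6:.0f} million minutes")  # 317 million minutes
print(f"{total_hours / 1e6:.1f} million hours")      # 5.3 million hours
```

Swapping in the 10-minute end of the multifamily range, or counting listings that never sold, only pushes the total higher.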

So I collected data from a few thousand single and multifamily listings in a few different markets throughout the U.S., and set about training an algorithm to automatically generate the “perfect” listing using attributes on the property and market. It took many iterations to develop something that sounded human, but once it was working, I realized it was pretty hard to tell the difference between a real listing and an AI generated one.
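To make "using attributes on the property and market" a little more concrete, here is a simplified sketch of how structured attributes might be turned into a prompt for a text-generation model. This is not the production code: the `ListingAttributes` class, the `build_prompt` function, the field names, and the sample values are all hypothetical, and the sketch stops short of calling any model. The style rules simply restate what homebuyers and renters told me they wanted.

```python
from dataclasses import dataclass, field


@dataclass
class ListingAttributes:
    """Hypothetical container for the structured data behind one listing."""
    beds: int
    baths: float
    square_feet: int
    neighborhood: str
    amenities: list[str] = field(default_factory=list)


def build_prompt(attrs: ListingAttributes) -> str:
    """Assemble a plain-text prompt from property attributes (illustrative only)."""
    amenity_text = ", ".join(attrs.amenities) if attrs.amenities else "none listed"
    style = (
        "Write a listing description in complete sentences. "
        "Describe the property as a walkthrough of the home. "
        "Do not use all caps or exaggeration, and do not omit the facts below."
    )
    facts = (
        f"Beds: {attrs.beds}\n"
        f"Baths: {attrs.baths:g}\n"
        f"Square feet: {attrs.square_feet}\n"
        f"Neighborhood: {attrs.neighborhood}\n"
        f"Amenities: {amenity_text}"
    )
    return f"{style}\n\n{facts}\n\nDescription:"


# Made-up sample values, for illustration only.
sample = ListingAttributes(
    beds=2,
    baths=1.5,
    square_feet=950,
    neighborhood="Sample Neighborhood",
    amenities=["hardwood floors", "granite countertops"],
)
print(build_prompt(sample))
```

Keeping the facts in a fixed, labeled block is one way to make it harder for a generated description to drop or invent details.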

I continued to optimize, building pipelines to continually retrain the single and multifamily description algorithms on real-time listing data. This meant they could be optimized for local markets, using the descriptions that performed best in those markets (in terms of rent and time on market) to produce an optimal description. I also added data from Google Places to improve localization, meaning the algorithm could convincingly refer to the best restaurants and attractions in every local market throughout the U.S. Pretty sure a broker in NYC can’t intelligently discuss the market in rural Iowa… but the algorithm can compellingly pitch a property in either market, and it can do it in under a second.
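As a rough illustration of the "best performing descriptions per market" idea, the sketch below groups past listings by market and keeps the top few. Everything here is an assumption made for the example: the `HistoricalListing` fields, the `top_examples_by_market` helper, the sample rows, and especially the ranking rule (fewest days on market first, then highest rent per square foot). The real pipeline is more involved.

```python
from collections import defaultdict
from typing import NamedTuple


class HistoricalListing(NamedTuple):
    """Hypothetical record of a past listing and how it performed."""
    market: str
    description: str
    rent_per_sqft: float
    days_on_market: int


def top_examples_by_market(listings, k=2):
    """Keep the k best-performing listings in each market.

    Assumed ranking: fewer days on market wins; ties go to higher rent per sqft.
    """
    by_market = defaultdict(list)
    for listing in listings:
        by_market[listing.market].append(listing)
    return {
        market: sorted(
            group, key=lambda item: (item.days_on_market, -item.rent_per_sqft)
        )[:k]
        for market, group in by_market.items()
    }


# Made-up rows, for illustration only.
history = [
    HistoricalListing("Market A", "desc 1", 2.10, 30),
    HistoricalListing("Market A", "desc 2", 2.40, 12),
    HistoricalListing("Market A", "desc 3", 2.60, 12),
    HistoricalListing("Market B", "desc 4", 1.20, 45),
]
best = top_examples_by_market(history, k=2)
print([item.description for item in best["Market A"]])
```

The selected descriptions could then serve as market-specific training examples each time the models are retrained.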

The Results: Better than Expected

When I had the algorithm in a good place, I surveyed dozens of potential homebuyers, asking them to compare real listings to my AI generated ones. It was anonymous, so they didn’t know which one was real and which was generated. How did it go? Drumroll please… 87% of them preferred the AI generated listing descriptions to the ACTUAL listing descriptions written by brokers. I knew it worked well, but that was way better than I expected.

Armed with the knowledge that I could now write listings better than about 9 out of 10 brokers in any market throughout the country, I decided I could either become a broker, or sell the algorithm via API. As a data scientist, I imagine I’d be pretty terrible in a showing. I’d probably talk endlessly about the relative value of various amenities and rattle off market statistics, while the prospective homeowners wandered the property alone. Probably not a good way to sell a home or apartment. So of course I elected to sell the algorithm.

If you’re interested in a demo, check out the resources below. For now, the API is focused on single and multifamily listings, but I plan to apply the same technique to optimize broker opinions of value, offering memos, sales communications, etc. That’s why the name is an extremely clever amalgamation of “Real Estate” and “Type”. Just imagine: What if everyone in real estate stopped writing terrible prose and focused on analyzing and selling deals? That’s a future I want to make happen!

Additional Resources

The API was designed to generate real estate listing descriptions that are extremely high quality, consistent in tone and style, compliant with fair housing laws, and cheap and nearly instantaneous to produce. You can try it out or learn more about the product from the resources below:
