Every time we demo HelloData.ai, we get similar questions. The first question is nearly always “Where do you get your data?” to which we explain that we go to thousands of property websites and several listing sites every day to collect updated rent and availability info. Then there are usually some questions on how our QualityScore algorithm works, how we’re different from other products, etc.
I wanted to write this series of posts to answer each of the most common questions we hear in demos, and to give some specific examples of how our approach delivers the best rent comps, highly accurate expense benchmarks, and a greatly accelerated multifamily market analysis process.
Where does HelloData get its data?
We monitor thousands of property websites across the U.S., as well as seven different rental listing sites, collecting data on every unit that hits the market, every day. Because we survey rents daily, we capture the first time each unit appears, how the price changes over time, and the last price before the unit is removed from the market, which is remarkably close to the leased rent for each unit.
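To make the daily tracking concrete, here is a minimal sketch of how a unit's listing history could be summarized. It is an illustration only, not HelloData's actual pipeline: the function name `summarize_listing` and the `(date, rent)` input format are assumptions made for this example.

```python
from datetime import date


def summarize_listing(observations):
    """Summarize one unit's daily observations.

    `observations` is a list of (date, rent) tuples, one per day the unit
    was seen on the market (hypothetical format for this sketch).
    """
    if not observations:
        raise ValueError("at least one observation is required")

    # Sort by date so the first and last entries are the first and last sightings.
    ordered = sorted(observations, key=lambda obs: obs[0])
    first_seen, first_rent = ordered[0]
    last_seen, last_rent = ordered[-1]

    # Record each day on which the rent differs from the previous observation.
    price_changes = [
        (day, rent)
        for (_, prev_rent), (day, rent) in zip(ordered, ordered[1:])
        if rent != prev_rent
    ]

    return {
        "first_seen": first_seen,
        "first_rent": first_rent,
        "last_seen": last_seen,
        "last_listed_rent": last_rent,  # proxy for the leased rent
        "price_changes": price_changes,
    }


history = [
    (date(2024, 1, 3), 2000),
    (date(2024, 1, 1), 2100),
    (date(2024, 1, 2), 2100),
]
summary = summarize_listing(history)
print(summary["last_listed_rent"])
```

The last listed rent is the value that gets compared against a rent roll.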
How close? On several demos, we’ve compared the last listed rent for each unit to the values on customers’ rent rolls, and 9 out of 10 times we hit the exact same value. In the other cases, we’re only off by about $5-10.
If you think about it, this makes perfect sense. If a prospective resident walks in the door saying they saw a unit listed for $2,000 a month, would any reasonable property manager say, “Sorry, the new rent is actually $2,200”? No way. Professional managers aren’t going to bait and switch at the door. If they did, they could quickly rack up negative reviews or even be liable for fair housing violations.
By capturing data at the unit level every day, we display the closest thing to actual rents you can get without owning the property. This has been one of our biggest competitive differentiators.
Which markets does HelloData cover?
We cover every U.S. state. We collect data on millions of listings across the U.S. every day, and we train our expense benchmarking algorithms on real financial data from over 25,000 multifamily properties nationwide. For more on our data, check out this page, which goes into detail about how we collect and process data.
Can HelloData analyze new developments?
Yes! We just released an update that lets users enter completely new properties by filling in year built, unit mix, quality and amenities to get accurate comps for new developments. With only an address and some high-level information, users can essentially produce a full feasibility study for a new development.
Apartment listing data is inaccurate. How does HelloData make sure its information is accurate?
We use a combination of techniques, including:
Photo Analysis – Because we analyze every listing photo, we can tell if someone copies a photo from a property website, slaps their logo on it, and lists it on Zillow. We’ve heard that over 45% of the listings on free listing sites are fraudulent. What often happens is that leasing agents create a fake listing with a watermarked logo and a slightly different address to capture prospective tenants and bring them to the property manager for the leasing commission. Our approach removes the vast majority of these fake listings.
Data Merging – We have algorithms that compare all of the data in one listing (meaning the address, year built, beds, baths, description, photos, etc.) to all of the data in similar listings to identify bad data. Our algorithms can identify which listings were posted by the property manager vs other brokers, and we remove garbage listings this way.
Outlier Removal – Our data pipeline includes algorithms to detect and remove outliers. Where rents are too high, or sqft values are too low, or other data points are well out of line with the market and our historical records for the property, we catch and remove them automatically.
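As an illustration of the outlier removal idea, here is a minimal sketch using a modified z-score based on the median absolute deviation (MAD). This is a common general-purpose technique chosen for the example; it is an assumption, not a description of the algorithms HelloData actually runs. The function name and the 3.5 threshold are also hypothetical.

```python
from statistics import median


def remove_outliers(values, threshold=3.5):
    """Drop values whose modified z-score exceeds `threshold`.

    Uses the median and median absolute deviation (MAD), which are less
    sensitive to extreme values than the mean and standard deviation.
    """
    values = list(values)
    if not values:
        return []

    med = median(values)
    mad = median([abs(v - med) for v in values])

    # If every value sits at (or very near) the median, nothing is flagged.
    if mad == 0:
        return values

    # 0.6745 scales the MAD so the score is comparable to a standard z-score.
    return [v for v in values if 0.6745 * abs(v - med) / mad <= threshold]


rents = [1950, 2000, 2025, 2050, 2100, 9500]
print(remove_outliers(rents))  # the 9500 listing is dropped
```

The same check could be applied to square footage or any other numeric field, either against the market or against a property's own history.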
What do you do if rent concessions are already included in the advertised rent?
In some markets, people advertise effective rents (from which concessions are already subtracted) AND they display the concession on the property website. In these cases, it’s possible to double count the concessions.
We haven’t addressed this yet, but we plan to make the concessions popup interactive, with a checkbox to include or exclude the concession in the effective rent. This will also help with edge cases where the lease term is not stated in the specials text: we’ll let the user change the data if anything is missing.
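To show why double counting matters, here is a small sketch of the arithmetic with an include/exclude switch. It assumes the concession is expressed as free months over a lease term; the function name, parameters, and 12-month default are hypothetical and not part of the product.

```python
def effective_rent(advertised_rent, free_months, lease_term_months=12,
                   concession_already_included=False):
    """Return the monthly effective rent.

    If the advertised rent already reflects the concession, it is returned
    unchanged so the concession is not subtracted a second time.
    """
    if lease_term_months <= 0:
        raise ValueError("lease_term_months must be positive")
    if not 0 <= free_months <= lease_term_months:
        raise ValueError("free_months must be between 0 and the lease term")

    if concession_already_included:
        return float(advertised_rent)

    # Spread the value of the free months across the whole lease term.
    return advertised_rent * (lease_term_months - free_months) / lease_term_months


# One month free on a 12-month lease at $2,400 advertised rent.
print(effective_rent(2400, 1))                                    # 2200.0
print(effective_rent(2400, 1, concession_already_included=True))  # 2400.0
```

If a $2,400 advertised rent were already an effective rent, subtracting the concession again would understate it by $200 a month in this example.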
What is the QualityScore?
QualityScore is an algorithm we developed to extract data and insights from real estate photos.
Here’s the story of our QualityScore algorithm. Early in 2023, we trained a computer vision algorithm on tens of thousands of listing photos to detect room types, extract amenities, and objectively assess real estate condition and quality from real estate imagery. It took a lot of manual effort, as we personally labelled the photos to ensure high quality.
It worked very well at first. The predicted room types and quality matched very well with our training data.
Then we had a value-add investor use our API to analyze every rental property in the city of Philadelphia to identify the “worst house on the best block” programmatically. He was looking for deals where the property was in a great neighborhood, but the property condition was poor – ideal value-add candidates.
The outputs were good, and he was able to find some great opportunities.
The issue was that if the lighting was poor, the photo was blurry, or the angle was bad, the algorithm unfairly penalized those photos. He showed us some of his own deals that were new construction, but because he had taken the photos himself on his phone in poor lighting, the predicted quality was lower than it should have been.
Shortly after this, we were approached by an appraiser who wanted to use our API to automatically label photos in an automated appraisal application he had built. To help bolster the outputs, he sent us around 150k labelled photos from his single-family and multifamily appraisals.
Appraisal photos are very different from listing photos. Nothing was staged. Messy bedrooms, poor lighting, blurry/grainy photos… nothing like a professional set of listing photos. This was exactly what we needed to balance the outputs.
So we cleaned and structured the new data, re-trained the algorithm with half listing photos and half appraisal photos, then re-ran it on the photos from before – the results were perfect.
By using a training set composed of both listing and appraisal photos, we built an algorithm that gets to the underlying quality of the asset, regardless of lighting and staging.
What makes a property have a score of 1 vs a score of 10?
Our ratings are on an absolute scale. We looked at photos from the worst possible units in the most dilapidated apartments in Chicago, and gave those all a 1. Deplorable quality and condition. Then we looked at the highest quality penthouse units in brand new developments, and gave them a 10 out of 10. Every photo falls somewhere in that range.
This applies not only to the property as a whole, but to each individual photo. We score each room and common area separately, then weight the individual scores to generate an overall score for each building. This gets us a consistent and objective way to compare any two properties, which we use extensively in our proprietary comparable property detection algorithm.
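The roll-up from photo scores to a building score can be sketched as a weighted average. The weights below are made up for illustration; the actual weighting is not described here, and the function name and room labels are hypothetical.

```python
def building_score(photo_scores, weights, default_weight=1.0):
    """Combine per-photo scores (1-10) into one building-level score.

    `photo_scores` is a list of (room_type, score) pairs.
    `weights` maps room_type to its relative importance; room types not
    listed fall back to `default_weight`.
    """
    if not photo_scores:
        raise ValueError("at least one photo score is required")

    total_weight = sum(weights.get(room, default_weight) for room, _ in photo_scores)
    weighted_sum = sum(
        weights.get(room, default_weight) * score for room, score in photo_scores
    )
    return weighted_sum / total_weight


scores = [("kitchen", 8), ("bathroom", 6), ("lobby", 4)]
example_weights = {"kitchen": 2.0, "bathroom": 1.0, "lobby": 1.0}  # illustrative only
print(building_score(scores, example_weights))  # 6.5
```

Because every photo is scored on the same absolute 1 to 10 scale, the resulting building scores can be compared directly across properties.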
In the next post, I’ll cover how our rent comp detection algorithm works, how we analyze competitor pricing strategies by looking at daily listing data, and how our market-driven revenue management system works. Stay tuned!
Marc worked in real estate for 5 years before launching multifamily analytics startup Enodo, which he sold to Walker & Dunlop (NYSE: WD) in 2019. At W&D, he served as Chief Product Officer, developing products that helped source billions in loan volume. Outside of work, he enjoys reading, running, and spending time with family.