Launch Time!

Say Hello to Hello Data!

Extract, Enrich & Predict with Real Estate Data

I’m pleased to announce that I recently launched Hello Data, an ensemble of solutions for extracting data from real estate documents and websites, enriching it with supplemental sources, and using it for prediction and automation. After working in data science for real estate for years, I first used that knowledge to solve my own housing problems. Since I had some time for R&D, that eventually became something much bigger. Here is how the various pieces of the puzzle came together:

  • It began when I tried to extract structured data from apartment floor plans to find one that met my fiancé’s and my expectations in a competitive housing market with awkward floor plans (we didn’t want the main bedroom to be a narrow corridor, a common feature in Amsterdam). With this floor plan extraction algorithm, we could filter every single new listing reaching the market in less than 100 ms and focus only on those meeting our criteria.
  • Of course, we also wanted to live in a nice area. So for each potential apartment, I used Deep Learning and Computer Vision to rank the view from the street in terms of curb appeal. To achieve this, I scored streets and facades based on maintenance, style, and other factors such as the presence of trees. After some refinement, the algorithm can now be run for any building in any city in the US. It can also be extended to the surroundings: people usually want to see nice places when they walk around their house.
  • Once I had a short list of apartments, I wanted to collect more data on each one from property websites and documents (PDFs) to get a more complete picture of each place. I wanted to do that without having to label thousands of documents, and to get back not only flat fields but structured answers with hierarchies of objects, lists, etc., which is where current extraction platforms fail.
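To make the floor plan filter from the first bullet concrete, here is a minimal sketch of what filtering listings on extracted room data could look like. It is an illustration only, not Hello Data's actual algorithm: the `Room` and `Listing` classes, the room labels, the rule that the largest bedroom is the main bedroom, and the area and aspect ratio thresholds are all assumptions made up for this example.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Room:
    kind: str        # hypothetical label, e.g. "bedroom" or "living_room"
    width_m: float
    length_m: float

    @property
    def area_m2(self) -> float:
        return self.width_m * self.length_m

    @property
    def aspect_ratio(self) -> float:
        # Longest side divided by shortest side; 1.0 means a square room.
        short, long_ = sorted((self.width_m, self.length_m))
        return long_ / short


@dataclass
class Listing:
    listing_id: str
    rooms: list[Room] = field(default_factory=list)


def main_bedroom(listing: Listing) -> Room | None:
    """Treat the largest bedroom as the main bedroom (an assumption)."""
    bedrooms = [r for r in listing.rooms if r.kind == "bedroom"]
    return max(bedrooms, key=lambda r: r.area_m2, default=None)


def meets_criteria(
    listing: Listing,
    min_area_m2: float = 10.0,      # illustrative threshold
    max_aspect_ratio: float = 2.0,  # above this, the room reads as a corridor
) -> bool:
    room = main_bedroom(listing)
    if room is None:
        return False
    return room.area_m2 >= min_area_m2 and room.aspect_ratio <= max_aspect_ratio


listings = [
    Listing("corridor-flat", [Room("bedroom", 2.0, 6.0), Room("living_room", 4.0, 5.0)]),
    Listing("square-flat", [Room("bedroom", 3.5, 4.0), Room("living_room", 4.0, 5.0)]),
]
shortlist = [item.listing_id for item in listings if meets_criteria(item)]
print(shortlist)  # prints ['square-flat']
```

The first listing has a large enough bedroom (12 m²), but its 3:1 aspect ratio marks it as corridor-like, so only the second listing survives the filter. Because the check is a handful of arithmetic operations on already-extracted data, running it on every new listing is cheap.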

Automated Listings & Revenue Management

All that gathered data was ideal for analyzing how those detailed characteristics worked together. How much are corridor square feet worth compared to those of a larger bedroom? Does any room layout offer an unfair advantage and generate more demand than others? Are there good practices for writing listing descriptions? I could answer these questions using the technology that had already been built!

  • The arrival of GPT-3 and better large language models at that time was pure luck. I could already extract unprecedentedly detailed data about houses and apartments, and had seen how bad the average listing description was: typos, emphasis on irrelevant details, and missed important points about amenities or neighborhood characteristics. With that data and GPT-3’s capabilities, I figured the problem of writing good listing descriptions could be solved. By overcoming GPT-3’s weaknesses with good data and some guidance, it was possible to generate listing descriptions so good that most people I showed them to preferred the AI-generated ones to the real thing.
  • About improving listings… the elephant in the room was getting the pricing right. With the ability to view all the data in a structured manner and estimate supply and demand, building rent adjustment recommendations from those signals was the natural next step. The result is a custom-built numerical simulation for each of your buildings. Thanks to machine learning, it learns how the market reacts at each step of the renting funnel. How likely are people to click on your listing? To ask for a visit? To apply? To leave a unit? It’s built so that you can play with rents and see how they impact the whole funnel. Then it finds the best tradeoff to maximize total income. It couldn’t be more transparent.
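The funnel idea can be illustrated with a toy simulation. This is only a sketch under strong assumptions, not the product's model: the logistic response curves and their coefficients are invented here (the real system is described as learning them from data), tenant turnover is left out, and the function names are hypothetical.

```python
import math


def logistic(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def lease_probability(rent: float, market_rent: float) -> float:
    """Chance that one listing view ends in a lease: click * visit * apply."""
    premium = (rent - market_rent) / market_rent  # relative price vs. market
    # Made-up sensitivities: every stage converts less as the premium grows.
    p_click = logistic(-6.0 * premium)
    p_visit = 0.5 * logistic(-4.0 * premium)
    p_apply = 0.4 * logistic(-3.0 * premium)
    return p_click * p_visit * p_apply


def expected_monthly_income(rent: float, market_rent: float, monthly_views: int) -> float:
    """Rent times the probability that at least one view converts this month."""
    p_lease = lease_probability(rent, market_rent)
    p_filled = 1.0 - (1.0 - p_lease) ** monthly_views
    return rent * p_filled


def best_rent(market_rent: float, monthly_views: int, step: int = 25) -> int:
    """Grid search over rents within 30% of the market rent."""
    candidates = range(int(market_rent * 0.7), int(market_rent * 1.3) + 1, step)
    return max(
        candidates,
        key=lambda r: expected_monthly_income(r, market_rent, monthly_views),
    )


if __name__ == "__main__":
    rent = best_rent(market_rent=2000, monthly_views=40)
    income = expected_monthly_income(rent, 2000, 40)
    print(f"Suggested rent: {rent}, expected monthly income: {income:.0f}")
```

Raising the rent increases income per lease but lowers the conversion rate at every stage of the funnel, so the search settles on the rent where those two effects balance. Swapping the invented curves for ones fitted to real click, visit, and application data is what would turn a toy like this into a usable recommendation.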

That is more products than startups usually get to build, but the synergy between them is so valuable that I believe it’s worth not doing it by the book. In the current economic context, where businesses must optimize their processes, speed them up, and automate the boring work, I hope that Hello Data will be able to create a ton of value! Don’t hesitate to DM me if you are interested in getting a demo!

Additional Resources

Hello Data was founded by data scientists and engineers with proven real estate domain expertise to help real estate professionals and PropTech companies build data-driven products. We’ve built data pipelines, predictive algorithms, and workflow automation technology for startups, publicly traded companies, and everything in between. We offer a suite of APIs that help you extract data from real estate documents and websites, enrich it with supplemental sources, and use it for prediction and automation. Learn more about Hello Data from the resources below:

Real Estate Data Extraction Products

  • Unlock Valuable Data from Real Estate Documents
  • Extract Structured Data from Multifamily Floor Plans

Real Estate Data Enrichment Products

  • Quantify the Curb Appeal of your Real Estate Investment
  • Add Rent, Amenity and Concessions Data to Your Pipeline

Real Estate Generation & Prediction Products

  • Write Real Estate Listing Descriptions in Seconds
  • Optimize Revenue for any Property Management Software


Other Resources

See how Hello Data can help your business