Search Engines
Despite being free to use, search engines make a lot of money. Kagi’s Why Pay for Search says that, “In 2022 Google generated USD $224.47 Billion dollars from advertisement revenue while processing approximately 8 Billion searches per day. At 365 days per year this amounts to approximately USD $0.07 revenue per search. If an average user searches 5 times per day, assuming a 30 day month this results in Google generating USD $11 revenue per user per month.”
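The arithmetic in that quote is easy to check. Here is a minimal Python sketch that uses only the figures quoted above; the variable names are my own. Unrounded, it comes out to roughly $0.077 per search and $11.53 per user per month, which the quote rounds to $0.07 and $11.

```python
# Figures taken from the Kagi quote above
AD_REVENUE_USD = 224.47e9      # Google ad revenue in 2022
SEARCHES_PER_DAY = 8e9         # approximate searches processed per day
DAYS_PER_YEAR = 365

revenue_per_search = AD_REVENUE_USD / (SEARCHES_PER_DAY * DAYS_PER_YEAR)

# Usage pattern assumed in the quote: 5 searches a day, 30-day month
SEARCHES_PER_USER_PER_DAY = 5
DAYS_PER_MONTH = 30

revenue_per_user_per_month = (
    revenue_per_search * SEARCHES_PER_USER_PER_DAY * DAYS_PER_MONTH
)

print(f"Revenue per search: ${revenue_per_search:.3f}")
print(f"Revenue per user per month: ${revenue_per_user_per_month:.2f}")
```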
These products are in the business of collecting as much data as possible about you, creating a profile that can be extremely creepy.
Signal, the end-to-end encrypted messaging app, tried to purchase ads that would show this kind of collection on Facebook. This article from Business Insider is worth a read.
Google and others are able to create these profiles because we willingly submit our data to them to try and find information online about things we are interested in. We search for “college football team tickets” or “is it normal that my eye is itchy in october” or “where to find friends in Montana” and more. The profile they make on you is sold to advertisers that are looking to target people in your very specific demographic.
I don’t want companies knowing this kind of information about my wife, my family, my friends, myself, or anyone. An easy step in the right direction to get away from this business model is to use a privacy friendly search engine. I read through the privacy policies of six different options and compiled a list of helpful information for making an informed decision about which search engine service you should use.
Privacy Friendly Search Engines
DuckDuckGo
- Ads based on search query
- Uses the “DuckDuckBot” for web crawling, but also sources a lot of their results from Bing
- Collected Data
When you visit our search engine or our other websites, your device sends some information about itself automatically, like its IP address, browser type, and language, and may send additional information upon request like its screen size, operating system, and preferences. However, we only use this information temporarily to deliver content to you and, for security, to ensure you’re not a malicious bot. We don’t save your IP address or any unique identifiers alongside your searches or visits to our websites. We also never log IP addresses or any unique identifiers to disk.
- According to this, DDG trashes any data that would be able to identify you as soon as they provide your search results.
... searchers often need accurate location-based results like local weather and restaurants. Interestingly, we can actually serve results (including instant answers and ads) for searches like these while still keeping you anonymous. To do this, we simply guess your location by default using a GEO::IP lookup with the IP address that is automatically sent to us via your device; then we throw away both the guessed location and the IP address, per our privacy policy. This process does not need to request any additional information than what you are already sending.
Ads By Microsoft on DuckDuckGo Private Search: Viewing ads is privacy protected by DuckDuckGo. We partner with many different information sources to deliver DuckDuckGo Search (e.g., Microsoft for ads, Apple for maps, etc.). When you view search results (including ads), your searches cannot be tied back to you, either by us or our partners. How this works technically is we do not store any personal identifiers (e.g., IP address) with your search terms, and we also proxy all requests to partners through us.
- DDG proxies the Microsoft ad requests through their servers, so Microsoft does not get your IP address.
This actually surprised me. I expected to see more hidden tracking in their privacy policy, but they seem pretty good. No tracking, but shows you ads.
There was a brief scare with DuckDuckGo when it was found that their mobile browser was not blocking some Microsoft trackers. I haven’t seen anything that should concern anyone about using their search engine.
Brave Search
- Ads
- Paid option
- Uses their own crawler and index for their results with an opt-in option to use Google if the results aren’t what you expected
- Collected Data
- Country of user that clicks on ad
- Records clicks on ads to see how many people interact with the respective ad
- Opt-out Metrics that collects the following:
  - Number of daily/weekly/monthly visits
  - Number of returning visits
  - Number of search queries per day
  - How long you’ve been using Brave Search
  - Average query length
  - How many users have chosen to leave feedback about Brave Search
  - The operating systems people use when they visit (e.g. macOS, Windows, etc)
  - The browser you’ve visited from (e.g. Brave, Chrome, Safari, etc)
  - Anonymous clicks and views of Ads appearing on Brave Search
  - Country location associated with clicks and views
This data—if you allow us to collect it—is anonymous and only analyzed in aggregate.
Brave currently partners with Amazon to expand product ad inventory for Brave Search advertising. This partnership is done in the spirit of Brave’s commitment to privacy. For queries with shopping intent that we would consider relevant for Amazon, Brave provides Amazon with the following non-personal data:
  - Query
  - User agent (browser)
  - Truncated IP address
    - IPv4: Only top 24 bits
    - IPv6: Only top 48 bits
- Amazon uses this data in accordance with their Privacy Notice.
- Amazon can’t single out your exact IP address from the truncated value, but it can still make an educated guess and build a profile from the data Brave shares with it.
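To get a feel for what "only top 24 bits" and "only top 48 bits" leave behind, here is a small sketch using Python's standard `ipaddress` module. The function name `truncate_ip` and the example addresses are my own (they come from documentation-reserved ranges), and this is an illustration of the truncation described in the policy, not Brave's actual code.

```python
import ipaddress


def truncate_ip(address: str) -> str:
    """Keep only the top 24 bits (IPv4) or top 48 bits (IPv6) of an address."""
    ip = ipaddress.ip_address(address)
    prefix_len = 24 if ip.version == 4 else 48
    # strict=False lets us pass a host address and get its enclosing network
    network = ipaddress.ip_network(f"{address}/{prefix_len}", strict=False)
    return str(network.network_address)


print(truncate_ip("203.0.113.77"))           # 203.0.113.0
print(truncate_ip("2001:db8:abcd:1234::1"))  # 2001:db8:abcd::
```

For IPv4, this means all 256 addresses in the same /24 block look identical to Amazon, so the truncated value narrows things down to a block rather than a single address.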
Anonymous local results: Brave does not know your location. If you’d like localized search results you can choose to filter based on your computer’s IP address. Neither your IP address nor any other geographical data will be stored or shared.
- Brave Search Premium
- $3/month to remove ads
- Payments through Stripe
Brave is also better than I thought it was. I don’t appreciate that their metrics are opt-out (on by default), but at least they offer the option. I do wish there was an option to opt out of the Amazon shopping results.
There was a brief scare with Brave when it was found out that they were automatically injecting their affiliate link for Binance in their browser. I haven’t seen anything that should concern anyone about using their search engine.
Qwant
- Ads based on search query
- Qwant uses its own index
- Collected Data (Claimed to be pseudonymized)
- Hashed value of IP Address (with a salt changing every three months at the latest)
- User Agent
- market segment of a request
- Date and time of search
- Country
- Language
- search keywords
- where a user came from
- search box used to trigger a query
- type of device used
- source of visit
- operating system
- major browser version
Qwant keeps for 1 month (from the 1st of the month) the keyword(s) entered associated with a pseudonymous identifier calculated from the User Agent of your browser and the hash salted with your IP address. After this period, keywords are no longer associated with an identifier and kept for 12 months for the purpose of aggregated statistical analysis (e.g. how many times a keyword is searched for over a given period).
- While Qwant doesn’t know what your IP address is (once stored), this correlation technically gives Qwant a referential identity for you. It is pseudonymous, so they won’t know that you are John Doe, but they will know that “IP-Address-Hash-Firefox-Version-162” searched for “best cream cheese for a bagel” on September 21, 2023. That correlation goes away after a month, or so they claim.
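Here is a rough sketch of how a pseudonymous identifier like this could work. Qwant's policy does not say which hash function or format they use, so SHA-256, the way the values are combined, the function name `pseudonymous_id`, and the salt values are all assumptions on my part.

```python
import hashlib


def pseudonymous_id(ip: str, user_agent: str, salt: str) -> str:
    """Derive a stable identifier from a salted IP hash plus the User Agent.

    Hypothetical scheme for illustration only, not Qwant's implementation.
    """
    ip_hash = hashlib.sha256((salt + ip).encode("utf-8")).hexdigest()
    combined = (ip_hash + user_agent).encode("utf-8")
    return hashlib.sha256(combined).hexdigest()[:16]


ip = "203.0.113.77"                # documentation-range example address
ua = "Mozilla/5.0 (example)"       # placeholder User Agent string

# Same salt: every search from this IP and browser maps to the same identifier
id_first_search = pseudonymous_id(ip, ua, salt="salt-period-1")
id_second_search = pseudonymous_id(ip, ua, salt="salt-period-1")

# Rotated salt: the identifier changes, so old and new searches no longer link up
id_after_rotation = pseudonymous_id(ip, ua, salt="salt-period-2")

print(id_first_search == id_second_search)    # True
print(id_first_search == id_after_rotation)   # False
```

The point is that the raw IP never has to be stored for searches to be linkable: as long as the salt stays the same, the identifier does too.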
In order to provide you with relevant results from around the world when we do not have the answers to your queries ourselves, we partner with Microsoft Ireland Operations Limited to provide a portion of our search results and provide contextual advertising based on the keywords entered and your geographic region. ... In addition, for security purposes and reliability of our partner’s services (detection of spam, automated activity, fraudulent clicks on advertisements …), Qwant may also collect and transfer to this partner your full IP address.
Why do you transfer data to Microsoft, and which ones? Microsoft provides some of the search results you see in our pages, and provides ads related to the keywords in your query. We must therefore send Microsoft certain information related to your search that allows our partner to send us results and advertisements relevant to that search, and to combat fraudulent clicks or other activities that are not permitted by our usage rules. To detect fraud, Qwant uses a specialized service offered by Microsoft, which does not have access to the keywords of your search. Only your IP address and the User Agent of your browser are communicated to this specialized service, to calculate a fraud probability score. The search keywords are sent separately to another service that does not know your IP address.
- Apparently they send your IP address (non-“pseudonymized”) and browser agent to Microsoft with every search!
So, it seems like Qwant is decent for privacy, but you really need to use a VPN. Otherwise Microsoft is going to get your IP Address with every single search. I actually don’t think I would recommend Qwant at all based on this.
Whoogle
The Privacy Policy of the instance you use is what matters. Here is the GitHub Link for more information.
- Self-Hosted
- Results come from Google
- Data Collection depends on the Whoogle instance you are using.
- By default, Whoogle offers the following:
  - No ads or sponsored content
  - No JavaScript*
  - No cookies**
  - No tracking/linking of your personal IP address***
  - No AMP links
  - No URL tracking tags (i.e. utm=%s)
  - No referrer header
  * No third party JavaScript. Whoogle can be used with JavaScript disabled, but if enabled, uses JavaScript for things like presenting search suggestions.
  ** No third party cookies. Whoogle uses server side cookies (sessions) to store non-sensitive configuration settings such as theme, language, etc. Just like with JavaScript, cookies can be disabled and not affect Whoogle's search functionality.
  *** If deployed to a remote server, or configured to send requests through a VPN, Tor, proxy, etc.
- The IP address of the server that is running Whoogle will be the IP that Google sees when searching
- This means that if you are self-hosting, you should run your Docker container (or whatever you’re using) through a VPN to stay anonymous
- Google can also profile the server IP along with your search queries.
- If you are self-hosting, it is best to share your instance with others to prevent Google from creating the profile entirely based around one person.
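One of the defaults listed above, removing URL tracking tags, is simple to picture in code. This is a minimal sketch using Python's standard `urllib.parse`; it is not Whoogle's own implementation, and treating every parameter that starts with `utm_` as a tracking tag is my assumption, as is the function name `strip_tracking_params`.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def strip_tracking_params(url: str) -> str:
    """Return the URL with any utm_* query parameters removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.lower().startswith("utm_")  # assumed tracking prefix
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


url = "https://example.com/page?id=7&utm_source=news&utm_medium=email"
print(strip_tracking_params(url))  # https://example.com/page?id=7
```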
Whoogle is good if you have an instance with multiple users and only want results from Google. While you can self-host this, I wouldn’t recommend hosting it as a private instance.
SearXNG
- Self-Hosted
- Metasearch engine which aggregates results from more than 70 search services
How does SearXNG protect privacy?
SearXNG protects the privacy of its users in multiple ways regardless of the type of the instance (private, public). Removal of private data from search requests comes in three forms:
  - removal of private data from requests going to search services
  - not forwarding anything from a third party services through search services (e.g. advertisement)
  - removal of private data from requests going to the result pages
Removing private data means not sending cookies to external search engines and generating a random browser profile for every request. Thus, it does not matter if a public or private instance handles the request, because it is anonymized in both cases. IP addresses will be the IP of the instance. But SearXNG can be configured to use proxy or Tor. Result proxy is supported, too.
SearXNG does not serve ads or tracking content unlike most search services. So private data is not forwarded to third parties who might monetize it. Besides protecting users from search services, both referring page and search query are hidden from visited result pages.
- Much like how Whoogle works, the IP address of the server that is running SearXNG will be the IP that the proxied search services see when searching
- This means that if you are self-hosting, you should run your Docker container (or whatever you're using) through a VPN to stay anonymous
- The proxied search services can also profile the server IP along with your search queries.
- If you are self-hosting, it is best to share your instance with others to prevent search services from creating the profile entirely based around one person.
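The quoted description above, "not sending cookies to external search engines and generating a random browser profile for every request," can be pictured with a short sketch. This is not SearXNG's code: the function name `build_outbound_headers`, the small User-Agent pool, and the fixed `Accept-Language` value are all hypothetical and only there to show the idea.

```python
import random

# Hypothetical pool of browser profiles, for illustration only
USER_AGENTS = [
    "Mozilla/5.0 (X11; Linux x86_64; rv:115.0) Gecko/20100101 Firefox/115.0",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:115.0) Gecko/20100101 Firefox/115.0",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:115.0) Gecko/20100101 Firefox/115.0",
]


def build_outbound_headers(incoming_headers: dict) -> dict:
    """Build headers for the upstream search service.

    Nothing from the user's own request (cookies, referrer, real User-Agent)
    is copied over; a browser profile is picked at random instead.
    """
    return {
        "User-Agent": random.choice(USER_AGENTS),
        "Accept-Language": "en-US,en;q=0.5",
    }


user_request = {
    "User-Agent": "MyRealBrowser/1.0",
    "Cookie": "session=abc123",
    "Referer": "https://example.com/previous-page",
}
print(build_outbound_headers(user_request))
```

Note that this only hides browser-level details. As the bullets above point out, the upstream service still sees the IP address of the instance itself.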
SearXNG is very similar to Whoogle, but you can get search results from a lot more search services. I’d still recommend using a public instance or, if self-hosting, opening your instance up to other users.
Kagi
- Paid
- Search results aggregate “data from multiple other sources, including but not limited to Google, Bing, and Wikipedia, and other internal data sources”
- Data Collected
IP Addresses and Geolocation: Kagi has features that either require or are enriched by knowing the client's physical location, such as our Maps product. When you connect to any website on the internet, you broadcast a source IP address to the server. This is a part of the IP protocol, on top of which internet traffic is built upon. This is the IP that Kagi uses to fulfill its geolocation lookups. It cannot be omitted from the protocol, so Kagi cannot say "no thanks" even if we wanted to. But there are means of spoofing the value to something else. The source IP is often provided by whatever router you are connected to, advertising the IP address that it has been leased by your ISP.
- Anonymous logs are aggregated with GCP’s logging tools, retained for 30 days.
- Anonymous logs are shared with Sentry for debugging purposes when bugs, crashes, or warnings occur.
Warrant Canary: We, Kagi, are committed to being transparent and taking full control of our service. Private information of our users has never been disclosed or seized, nor have we been compromised or suffered a data breach. Kagi has received:
  - 0 National Security letters;
  - 0 Gag orders;
  - 0 Warrants from any government organization;
To ensure your privacy and security, we don’t monitor, log or store your queries or associate them with your account.
Kagi seems to be the best of the bunch. No ads, no logging, and they have a transparent warrant canary to back it up. Kagi is usable with a free trial that allows you to search 100 times. After that, Kagi becomes a paid service. Kagi offers a fair price for the product: $5 per month gives you 300 searches per month, and $10 per month gives you unlimited searches. Kagi also offers two different family plans: $14 per month for two users and $20 per month for up to six users, both of which include unlimited searches. Paying annually offers a 10% discount for all options.
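If you want to compare the plans, the numbers above are quick to work out. This sketch assumes the $5 tier means 300 searches per month and that the annual discount is a flat 10% off twelve monthly payments; the plan labels and names in the code are my own.

```python
# Monthly prices in USD, as listed above
MONTHLY_PRICES_USD = {
    "300 searches": 5,
    "unlimited": 10,
    "family (2 users)": 14,
    "family (up to 6 users)": 20,
}
ANNUAL_DISCOUNT = 0.10  # assumed to apply as a flat 10% off 12 months


def annual_price(monthly_price: float) -> float:
    """Yearly cost when paying annually with the discount applied."""
    return round(monthly_price * 12 * (1 - ANNUAL_DISCOUNT), 2)


for plan, monthly in MONTHLY_PRICES_USD.items():
    print(f"{plan}: ${monthly}/month or ${annual_price(monthly):.2f}/year")

# Cost per search if you use every search on the $5 tier
cost_per_search = MONTHLY_PRICES_USD["300 searches"] / 300
print(f"Cost per search on the $5 tier: ${cost_per_search:.4f}")
```

At under two cents per search, the $5 tier costs less per search than the roughly seven cents Google earns from each one in the quote at the top of this post.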
Which one should I choose?
That’s up to you and what your threat model is. Each of the products I listed has pros and cons. They all have different and unique feature sets. They all pull from different search indexes. You should choose the one that provides the best results and the amount of privacy you desire.
After going through each of the privacy policies, my personal rankings would be: Kagi, SearXNG, DuckDuckGo, Whoogle, Brave, Qwant [1].
[1] Qwant has been struck through as I cannot recommend it due to the service sending your IP address to Microsoft with every search.