Configuration¶
Dead Simple Search is configured entirely through environment variables. This means you don't need to edit any configuration files — you set values in your shell or your deployment platform, and the application reads them when it starts.
**What's an environment variable?**
An environment variable is a named value that lives outside your application code. It's a common way to configure software, especially in server environments. You set one by typing something like `export MYSQL_PASSWORD=secret` in your terminal before running the app.
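For example, in a POSIX shell you can set a variable and confirm it is visible before launching the app (the value here is a placeholder):

```shell
# Set the variable for this shell session (placeholder value)
export MYSQL_PASSWORD=secret

# Any program started from this shell can now read it
echo "$MYSQL_PASSWORD"    # prints: secret
```

The `export` keyword matters: without it, the variable exists in your shell but is not passed on to programs you start.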
Database settings¶
These tell Dead Simple Search how to connect to your MySQL database.
| Variable | Default | Description |
|---|---|---|
| `MYSQL_HOST` | `127.0.0.1` | The address of your MySQL server |
| `MYSQL_PORT` | `3306` | The port MySQL is listening on |
| `MYSQL_USER` | `deadsimplesearch` | Database username |
| `MYSQL_PASSWORD` | `deadsimplesearchpass` | Database password |
| `MYSQL_DATABASE` | `deadsimplesearch` | Name of the database to use |
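For example, to connect to a MySQL server on another machine instead of the local defaults (the hostname and credentials below are placeholders):

```shell
# Point at a remote MySQL server (placeholder values)
export MYSQL_HOST=192.168.1.20
export MYSQL_PORT=3306
export MYSQL_USER=deadsimplesearch
export MYSQL_PASSWORD=change-me
export MYSQL_DATABASE=deadsimplesearch
```

Any variable you don't set keeps its default from the table above.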
**Change the default password**
The default password is only meant for local development. Always set a strong `MYSQL_PASSWORD` in production.
Crawler settings¶
These control how the crawler behaves when visiting your website.
| Variable | Default | Description |
|---|---|---|
| `CRAWL_DELAY_SECONDS` | `1.0` | Seconds to wait between requests. This is a "politeness" setting — it avoids overloading the target server. |
| `CRAWL_MAX_PAGES_PER_SITE` | `10000` | Maximum number of pages to crawl per site in a single run. |
| `CRAWL_REQUEST_TIMEOUT` | `30` | How many seconds to wait for a single page to respond before giving up. |
| `CRAWL_USER_AGENT` | `DeadSimpleSearchBot/1.0 (+https://example.com/bot)` | The name the crawler identifies itself with when visiting pages. Website owners can see this in their server logs. |
**Crawl delay explained:** When the crawler visits a website, it pauses between each page request. A delay of 1.0 seconds means it fetches at most one page per second. Lower values make the crawl faster but put more load on the target server. Be respectful — most website owners appreciate crawlers that don't hammer their servers.
Scheduler settings¶
The scheduler lets you automatically re-crawl all your registered sites at regular intervals (that's what "scheduled" means — it runs on a timer, like a recurring alarm).
| Variable | Default | Description |
|---|---|---|
| `SCHEDULER_ENABLED` | `false` | Set to `true` to enable automatic re-crawling. |
| `SCHEDULER_INTERVAL_HOURS` | `24` | How many hours between each re-crawl run. |
Example — re-crawl every 12 hours:
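Set both scheduler variables before launching the app:

```shell
export SCHEDULER_ENABLED=true
export SCHEDULER_INTERVAL_HOURS=12
python app.py
```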
Flask settings¶
These control the web server itself.
| Variable | Default | Description |
|---|---|---|
| `FLASK_HOST` | `0.0.0.0` | The network address to listen on. `0.0.0.0` means "accept connections from anywhere." |
| `FLASK_PORT` | `5555` | The port number the API listens on. |
| `FLASK_DEBUG` | `false` | Set to `true` for development. Gives more detailed error messages but should never be used in production. |
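If you run the API behind a reverse proxy such as nginx, a common hardening step is to listen only on the loopback interface so the app is never reachable directly from the outside (the values below are illustrative):

```shell
# Accept connections only from this machine; the reverse proxy
# forwards public traffic to this port.
export FLASK_HOST=127.0.0.1
export FLASK_PORT=5555
export FLASK_DEBUG=false
```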
Example: production setup¶
Here's what a typical production launch might look like:
```shell
export MYSQL_HOST=db.internal.example.com
export MYSQL_PASSWORD=a-very-strong-password
export MYSQL_DATABASE=deadsimplesearch
export CRAWL_DELAY_SECONDS=1.5
export CRAWL_USER_AGENT="MySearchBot/1.0 (+https://mysite.com/bot)"
export SCHEDULER_ENABLED=true
export SCHEDULER_INTERVAL_HOURS=24
export FLASK_PORT=8080
python app.py
```
**Using a process manager**
For production, consider running Dead Simple Search behind a process manager like systemd or supervisor. This ensures it restarts automatically if it crashes, and makes it easy to manage as a service on your server.