Forum is slow as a snail. DDOS?

A thread by Cypheros · started on 18 Sep 2025 · last post on 11 Oct 2025
 
Kas Ob.

Registered since: 3 Sep 2023
482 posts

#25

Re: Forum is slow as a snail. DDOS?

6 Oct 2025, 08:03
Quote:
Fail2Ban won't help much, since it only blocks login attempts after a freely configurable number of failed attempts. Most of the traffic, however, does not come from login attempts but from page requests. It could also be an attempt to overload the forum in order to take it offline.
Well, Fail2Ban is not that simple; it may start out like that, but it is far more powerful. You can read about how it works and about the power of its custom filtering.

Anyway, I asked ChatGPT:
Quote:
Using "Fail2Ban with custom filters" how to recognize and stop crawlers and scrapers?
The answer was:
Code:
Excellent question — and an advanced one 👏

Fail2Ban is traditionally used to block brute-force login attempts or repeated bad HTTP requests, but it can be extended very effectively to recognize and stop web crawlers / scrapers by using custom filters and jail rules on your web server logs (Apache or Nginx).

Let’s go step-by-step so you can build a reliable and non-destructive setup.

🧩 1. General Concept

Fail2Ban works like this:

It monitors log files (e.g. /var/log/nginx/access.log).

It applies regex filters to detect bad patterns (excessive requests, fake user agents, 404 floods, etc).

Once the threshold is reached (e.g. maxretry = 10 in findtime = 60s), it bans the offending IP (by default using iptables or firewalld).
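
For reference, a single entry in that log, assuming Nginx's default "combined" format, looks roughly like this (the IP address, timestamp and URL are made-up examples); <HOST> in the filters below is Fail2Ban's placeholder for the leading client IP, and the last quoted field is the User-Agent:

203.0.113.7 - - [06/Oct/2025:08:03:00 +0200] "GET /index.php HTTP/1.1" 200 5120 "https://example.com/" "Mozilla/5.0 (X11; Linux x86_64)"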

🕵️ 2. What Defines a Crawler / Scraper?

Scrapers tend to show one or more of these traits:

Suspicious user-agents (empty, fake, or known bad bots).

Excessive request rate from a single IP.

Repeated 404/403s (trying random URLs).

Requests for disallowed paths (/wp-admin, /phpmyadmin, /etc/passwd, etc).

Ignoring robots.txt.

Fail2Ban can catch all these with different filters.

⚙️ 3. Example: Basic Anti-Scraper Jail (Nginx)
3.1 Create the filter /etc/fail2ban/filter.d/nginx-noscrape.conf
[Definition]
failregex = <HOST> -.*"(GET|POST).*HTTP.*" 200
ignoreregex =


But this alone bans everything (too broad).
Let’s refine.

Example detecting excessive requests with bad user-agents or too fast access (tuned via log tailing):

[Definition]
failregex = ^<HOST> -.*"(GET|POST).*HTTP/.*" 200 .*"(?:-|curl|wget|python|scrapy|bot|spider|crawler|httpclient|libwww|nikto|scan|Go-http-client).*"$
ignoreregex =


That matches requests containing those keywords in the User-Agent field.

3.2 Create the jail /etc/fail2ban/jail.local entry
[nginx-noscrape]
enabled = true
port    = http,https
filter  = nginx-noscrape
logpath = /var/log/nginx/access.log
maxretry = 20
findtime = 60
bantime = 86400
action  = iptables-multiport[name=NoScrape, port="http,https"]


This bans any IP that hits 20 bad user-agent requests in 60 seconds.

🧠 4. Example: Detect Floods / Too Many Requests (regardless of UA)

You can also catch excessive requests (rate-limit style):

Filter /etc/fail2ban/filter.d/nginx-flood.conf
[Definition]
failregex = ^<HOST> -.*"(GET|POST).*HTTP/.*"

Jail /etc/fail2ban/jail.local
[nginx-flood]
enabled = true
port    = http,https
filter  = nginx-flood
logpath = /var/log/nginx/access.log
maxretry = 100
findtime = 10
bantime = 600


That means “if any IP makes more than 100 requests in 10 seconds — ban for 10 minutes”.

This is crude but effective against aggressive crawlers that ignore rate limits.

🪤 5. Detect Repeated 404s (path probing)
[nginx-404]
enabled = true
port    = http,https
filter  = nginx-404
logpath = /var/log/nginx/access.log
maxretry = 10
findtime = 60
bantime = 3600


Filter (/etc/fail2ban/filter.d/nginx-404.conf):

[Definition]
failregex = ^<HOST> -.*"(GET|POST).*" 404
ignoreregex =

🧱 6. Combine with Nginx/Apache Hints

You can:

Add limit_req_zone in Nginx to rate-limit per IP (lightweight, fast); a sketch follows after this list.

Add custom log formats to expose user-agents or country (for fine control).

Use GeoIP to ignore trusted countries or whitelists.
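
A minimal sketch of the limit_req_zone idea from the list above (the zone name perip, the 10 r/s rate and the burst of 20 are placeholder values to tune for your own traffic):

# in the http {} block
limit_req_zone $binary_remote_addr zone=perip:10m rate=10r/s;

# in the server {} or location {} block serving the site
limit_req zone=perip burst=20 nodelay;
limit_req_status 429;

Nginx then rejects excess requests itself (here with 429) before they reach the application, which complements the log-based jails above.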

🪶 7. Optional: Ignore Good Bots

You can whitelist good bots (Google, Bing, etc) via DNS or regex:

ignoreip = 127.0.0.1/8 ::1 66.249.0.0/16 40.77.167.0/24


or in the filter:

ignoreregex = "(Googlebot|Bingbot|YandexBot)"

🧩 8. Testing Your Filters

Run:

fail2ban-regex /var/log/nginx/access.log /etc/fail2ban/filter.d/nginx-noscrape.conf


This shows which lines match and helps tune your regex.

🧱 9. Optional: Combined Smart Detection

Some admins merge metrics (404 + high rate + bad UA) in a single jail for smarter detection.
Example composite filter:

failregex = ^<HOST> -.*"(GET|POST).*" (403|404|444|499)
            ^<HOST> -.*"(GET|POST).*HTTP/.*" 200 .*"(?:curl|wget|python|bot|scrapy|Go-http-client).*"$

🧰 10. Maintenance Tips

Keep bantime long enough but not too aggressive (1 day is typical for bots).

Review /var/log/fail2ban.log weekly (a small sketch follows after this list).

Whitelist known partners or CDNs.

Combine with Cloudflare / ModSecurity for layered protection.
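
A small sketch for that weekly log review (the jail name and IP are examples; fail2ban-client set <jail> unbanip lifts a ban manually if you catch a false positive):

# most frequently banned IPs; the IP is the last field of "Ban" lines
grep ' Ban ' /var/log/fail2ban.log | awk '{print $NF}' | sort | uniq -c | sort -rn | head

# lift a false-positive ban
sudo fail2ban-client set nginx-noscrape unbanip 203.0.113.7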

Would you like me to show a ready-to-use bundle of filters (noscrape + 404 + flood) in one jail.local that you can drop directly into a Debian/Ubuntu Fail2Ban install?
And I followed up with: "Yes, show me that"

Code:
Perfect 👍 — here’s a ready-to-use anti-crawler / anti-scraper setup for Fail2Ban that you can copy directly to /etc/fail2ban/jail.local.

This setup has three protection layers:

noscrape → catches bad user-agents

404 → catches scanners and broken scrapers

flood → catches too many requests per second

All work together and are tuned to minimize false positives while effectively stopping bad bots.

🧱 /etc/fail2ban/jail.local
# ===============================
#  Anti-Crawler / Anti-Scraper Jails
# ===============================

[DEFAULT]
# Whitelist localhost and your internal networks
ignoreip = 127.0.0.1/8 ::1

bantime = 86400        # 1 day ban
findtime = 60
maxretry = 5
backend = auto
banaction = iptables-multiport

# ===============================
#  1. Bad User-Agent Scrapers
# ===============================
[nginx-noscrape]
enabled = true
filter  = nginx-noscrape
logpath = /var/log/nginx/access.log
port    = http,https
maxretry = 20
findtime = 60
bantime = 86400
action  = iptables-multiport[name=NoScrape, port="http,https"]
ignorecommand =
# You can increase `maxretry` if you have legitimate API traffic

# ===============================
#  2. Repeated 404s / Path Probing
# ===============================
[nginx-404]
enabled = true
filter  = nginx-404
logpath = /var/log/nginx/access.log
port    = http,https
maxretry = 10
findtime = 60
bantime = 3600
action  = iptables-multiport[name=No404, port="http,https"]

# ===============================
#  3. Request Flooding (rate-based)
# ===============================
[nginx-flood]
enabled = true
filter  = nginx-flood
logpath = /var/log/nginx/access.log
port    = http,https
maxretry = 100
findtime = 10
bantime = 600
action  = iptables-multiport[name=Flood, port="http,https"]

🧩 /etc/fail2ban/filter.d/nginx-noscrape.conf
[Definition]
# Detect common scraping tools or fake/bad bots in User-Agent
failregex = ^<HOST> -.*"(GET|POST).*HTTP/.*" .*"(?:-|curl|wget|python|httpclient|libwww|nikto|scan|scrapy|Go-http-client|java|php|perl|ruby|bot|crawler|spider).*"$
ignoreregex = "(Googlebot|Bingbot|YandexBot|DuckDuckBot|AhrefsBot|Applebot)"


This bans clients using clearly automated or fake agents, but ignores major search engines.
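
To sanity-check this filter against a single, made-up log line before enabling the jail, fail2ban-regex also accepts a log line string instead of a file (the IP and path here are invented for illustration):

fail2ban-regex '203.0.113.7 - - [06/Oct/2025:08:03:00 +0200] "GET /forum/index.php HTTP/1.1" 200 5120 "-" "python-requests/2.31.0"' /etc/fail2ban/filter.d/nginx-noscrape.conf

The python-requests User-Agent should be reported as a match, while the same line with a Googlebot User-Agent should be skipped because of the ignoreregex.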

🧩 /etc/fail2ban/filter.d/nginx-404.conf
[Definition]
# Detect repeated 404 responses from same IP
failregex = ^<HOST> -.*"(GET|POST).*" 404
ignoreregex =

🧩 /etc/fail2ban/filter.d/nginx-flood.conf
[Definition]
# Detect excessive requests (any response code)
failregex = ^<HOST> -.*"(GET|POST).*HTTP/.*"
ignoreregex =

🔍 Test Before Activating

Run:

sudo fail2ban-regex /var/log/nginx/access.log /etc/fail2ban/filter.d/nginx-noscrape.conf


That shows which log entries match your filters — crucial to avoid banning good users.

Then enable and start Fail2Ban:

sudo systemctl enable fail2ban
sudo systemctl restart fail2ban
sudo fail2ban-client status


You’ll see jails like:

Jail list: nginx-noscrape, nginx-404, nginx-flood


To inspect bans:

sudo fail2ban-client status nginx-noscrape

⚙️ Optional Tweaks

For Apache, just change logpath to /var/log/apache2/access.log and the filters will still work (format compatible).

If your log format differs, adjust the regex (especially if you use JSON or custom fields); the standard combined format is shown after this list for reference.

Add your whitelist under [DEFAULT] → ignoreip.
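
For reference, the "combined" access-log format that the regexes above assume is defined in Nginx as:

log_format combined '$remote_addr - $remote_user [$time_local] '
                    '"$request" $status $body_bytes_sent '
                    '"$http_referer" "$http_user_agent"';

If your access_log uses a different log_format, adjust the field order in the failregex lines accordingly.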

Would you like me to extend this setup to log and email ban notifications (showing IP, user-agent, and request pattern)? It helps to analyze who’s hitting your site and why.
Kas