wfn/mycene

TL;DR

Toolkit for internet scanning and analysis. One tool in early working state right now:

  • mipdb: multithreaded importer and parser for masscan results; analyses the whole IPv4 space (0.0.0.0/0) from TCP SYN scans (builds its own internal DB and persists it)
  • the ASCII diagram below shows the intended overall pipeline
  • to build: just gcc mipdb.c -o mipdb

Mycene

Μυκήνη (Mykene). A nymph, daughter of the river god Inachos of Argos.

(I've decided to make this WIP repo public with one working prototype tool (mipdb, a PoC) because it's good not to hide one's code; I intend to slowly keep working on this whole thing.)

What this is

This repo is meant to hold a toolkit for multistage internet-wide scanning, results analysis and presentation. See the diagram below for what actually works so far: masscan import, a proof-of-concept conversion into useful data structures (this does work), and a very simplistic PoC REPL for quick analysis. Basically: jump to mipdb below.

Toolkit diagram (flow from top, components at bottom)

Component enumeration not complete, internal structure very WIP.

Owning                                                                         
Entity:     DRAFT FLOW SUMMARY:                                                
            ===================                                                
                                                                               
           ┌──────────────────┐                                                
User       │Define scan scope │                                                
           │(IP range, ports) │                                                
           └───────┬──────────┘                                                
                Data: configs                                                  
           ┌───────▼──────────────┐                                            
Workflow   │Auto provision VPS,   │                                            
mgr        │provision rentable IPs│                                            
           │if needed             │                                            
           └───────┬──────────────┘                                            
                Data: configs                                                  
           ┌───────▼────────┐         ┌──────────────┐                         
Workflow   │Phase1 scan:    │ Import1 │Multistage    │                         
mgr        │Masscan, sharded┼─────────►Modular       │                         
           │TCP SYN only    │         │Results data  ┼─────┐                   
           └───────┬────────┘ ┌───────┼Analysis & DB │     │                   
                Data: IP lists│ Provide  * mipdb PoC │     │                   
           ┌───────▼──────────▼┐      └──▲───┬────▲──┘     │                   
Workflow   │Phase2 scan:       │         │   │    │        │                   
mgr        │nmap, service      ┼─Import2─┘   │  Import3    │                   
           │discovery incl. -sV│             │    │        │                   
           └───────┬───────────┘             │    │        │                   
                Data: IP lists               │    │        │                   
          ┌────────▼─────────┐               │    │        │                   
Workflow  │ Phase3 scan:     │               │    │        │                   
mgr │     │ nmap, scripts,   ◄────Provide────┘    │        │                   
    │     │ more details tbd ┌────────────────────┘        │                   
    │     └──────────────────┘                             │                   
    │                                                      │                   
    │                            ┌─────────────────────────▼────────┐          
    │      ┌────────────────┐    │mipdb: scan results DB & Analysis │          
    └─Is───►Workflow manager┼────► * PoC implementation exists      │          
           ├────────────────┘Uses└──────────────────────────────────┘          
           │  ┌────────────────────────────────────────────────────────────┐   
           │  │Scan orchestrator                                           │   
         Manages                                                           │   
           │  │ - masscan mgr (phase1: host & port discovery)              │   
           │  │ - nmap mgr (phase2: service identification & banners)      │   
           ├──►            (phase3: service probing, version checks        │   
           │  │             via nmap scripts)                              │   
           │  │ - selective re-scan (some kind of modular phase thing)     │   
           │  │ - fetch results, discover remote files if can't find easily│   
           │  └────────────────────────────────────────────────────────────┘   
           │  ┌───────────────────────────────────────────────────────────────┐
           │  │Scan host provisioner                                          │
         Manages                                                              │
           │  │ - right now 10-20 VPS manually created (hourly billing),      │
           │  │   some bash & rsync to set up masscan with shard option       │
           │  │   (VERY useful option, i don't hear enough ppl using it,      │
           └──►    very easy just need to share same random seed)             │
              │   and results file fetch via ssh/rsync (iirc ssh - sftp)      │
              │ - it's simple to define sequential hostnames (s1-s16 e.g.),   │
              │   tell masscan which shard it is easily then, and fetch e.g.  │
              │   s1.bin (bash iterates and fetches)                          │
              │ - I want a proper tool, not this hacky stuff, though          │
              │ - and most VPS have API - e.g. here I used scaleway           │
              │ - and so I want to try out their API                          │
              │ - and include its client as a plugin thing                    │
              │ - so that you could easily add your own using simple interface│
              └───────────────────────────────────────────────────────────────┘

mipdb

Note: it can start a simple REPL so you can keep querying kinda-interactively (I want realtime ncurses-based feedback though).

% ./mipdb --help
IPv4 Scan Database Processor v1.0

Usage: ./mipdb [OPTIONS]

Options:
  -p, --parse FILE     Parse scan results from FILE
  -o, --output FILE    Save database to FILE
  -i, --input FILE     Load database from FILE
  -s, --stats          Analyze and print database statistics
  -c, --counts         Write count of IPs per every active /16 to ip16_counts.txt
  -l, --lookup IP      Look up ports for a specific IP address
  -n, --subnet CIDR    Search for IPs in a subnet (CIDR notation)
  -m, --memory         Display detailed memory usage statistics
  -t, --threads N      Use N threads for processing (default: 4)
  -w, --wipmem         WIP dev option: print memory related estimates for future work
  -h, --help           Display this help message

Examples:
  ./mipdb -p scan_results.txt -t 16 -o database.ipdb
  ./mipdb -i database.ipdb -s
  ./mipdb -i database.ipdb -l 8.8.8.8
  ./mipdb -i database.ipdb -n 192.168.1.0/24
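
The -n example above takes a subnet in CIDR notation. As a rough sketch of what that involves (this is not taken from mipdb.c; cidr_to_range and ip_in_range are hypothetical names), a CIDR string can be turned into an inclusive range of 32-bit addresses like this:

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper, not from mipdb.c: parse "a.b.c.d/len" into an
 * inclusive range of host-byte-order 32-bit addresses.
 * Returns 0 on success, -1 on malformed input. */
int cidr_to_range(const char *cidr, uint32_t *first, uint32_t *last)
{
    unsigned a, b, c, d, len;
    char trailing;

    /* Field widths bound each number; the extra %c catches trailing
     * garbage such as "10.0.0.0/8x" (sscanf would then return 6). */
    if (sscanf(cidr, "%3u.%3u.%3u.%3u/%2u%c",
               &a, &b, &c, &d, &len, &trailing) != 5)
        return -1;
    if (a > 255 || b > 255 || c > 255 || d > 255 || len > 32)
        return -1;

    uint32_t ip = ((uint32_t)a << 24) | ((uint32_t)b << 16) |
                  ((uint32_t)c << 8) | (uint32_t)d;
    /* Shifting a 32-bit value by 32 is undefined in C, so /0 is special-cased. */
    uint32_t mask = (len == 0) ? 0u : (0xFFFFFFFFu << (32 - len));

    *first = ip & mask;      /* network address */
    *last = *first | ~mask;  /* last address in the subnet */
    return 0;
}

/* An address belongs to the subnet when it falls inside the inclusive range. */
int ip_in_range(uint32_t ip, uint32_t first, uint32_t last)
{
    return ip >= first && ip <= last;
}
```

With a range in hand, a subnet search reduces to a range query over whatever sorted or indexed structure holds the addresses.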

Scan the internet with masscan, obtain say 30 million results (ip+port pairs), then import it all here, and have a realtime way to look up results, including for arbitrary IP ranges.
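
To give an idea of the import step, here is a minimal sketch of parsing one line of masscan list output into an (ip, port) pair. It assumes lines of the form "open tcp 80 192.0.2.1 1700000000" plus "#" comment lines; struct scan_hit and parse_list_line are hypothetical names, and the real parser in mipdb.c may well work differently.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* One (ip, port) pair; a hypothetical record type, not mipdb's own. */
struct scan_hit {
    uint32_t ip;    /* host byte order */
    uint16_t port;
};

/* Parse one line of masscan list output, assumed to look like
 *   open tcp 80 192.0.2.1 1700000000
 * Returns 1 if an open TCP port was parsed, 0 for comments and anything else. */
int parse_list_line(const char *line, struct scan_hit *hit)
{
    char state[16], proto[8];
    unsigned port, a, b, c, d;

    if (line[0] == '#')
        return 0;
    /* The trailing timestamp is ignored in this sketch. */
    if (sscanf(line, "%15s %7s %5u %3u.%3u.%3u.%3u",
               state, proto, &port, &a, &b, &c, &d) != 7)
        return 0;
    if (strcmp(state, "open") != 0 || strcmp(proto, "tcp") != 0)
        return 0;
    if (port > 65535 || a > 255 || b > 255 || c > 255 || d > 255)
        return 0;

    hit->ip = ((uint32_t)a << 24) | ((uint32_t)b << 16) |
              ((uint32_t)c << 8) | (uint32_t)d;
    hit->port = (uint16_t)port;
    return 1;
}
```

Since each line is parsed independently, a file can be split into chunks at line boundaries and handed to separate threads, which is presumably where the -t option comes in.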

Memory usage is very reasonable. (I've also tried a version with a bitfield (a port map) per IP; it works, but lots of memory is (arguably) wasted, especially without any compression.)
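
To put rough numbers on the bitfield idea: a full port map needs one bit per possible port, i.e. 65536 bits per IP. The sketch below just does that arithmetic and compares it with storing a plain 2-byte port number per (ip, port) pair. The function names are hypothetical, and the second layout is only a point of comparison, not necessarily what mipdb does.

```c
#include <stdint.h>

#define PORT_COUNT 65536u   /* ports 0..65535 */

/* Bytes needed for one full port bitmap: one bit per possible port. */
uint64_t bitmap_bytes_per_ip(void)
{
    return PORT_COUNT / 8;   /* 8192 bytes = 8 KiB */
}

/* Total for one uncompressed bitmap per responding host. */
uint64_t bitmap_total_bytes(uint64_t hosts)
{
    return hosts * bitmap_bytes_per_ip();
}

/* Comparison layout: one 2-byte port number per (ip, port) pair,
 * ignoring any per-host bookkeeping. */
uint64_t port_list_total_bytes(uint64_t pairs)
{
    return pairs * sizeof(uint16_t);
}
```

As an illustration with made-up inputs: if 30 million pairs were spread over 10 million hosts (an assumed figure), bitmap_total_bytes(10000000) gives 81,920,000,000 bytes (about 82 GB), while port_list_total_bytes(30000000) gives 60,000,000 bytes (60 MB).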

TODO for mipdb

Better README, this one's hastily typed in.

Lots of stuff besides the below; the quick ones:

  • IPv6 support
  • support importing masscan binary format (so there's no need to convert to masscan list format first, which is what I currently do)
  • robust format support (ideally with e.g. optional banners; so no implicit assumption that banners will come from e.g. nmap)
  • there's a bug (probably) while freeing memory on shutdown; I'm just lazy but actually curious what this is, it should be easy to spot, maybe a single breakpoint at most
  • actually useful output options (programmatic; to file; when printing to terminal: more concise, more info, colours too, I like them)
  • export to Parquet
  • ..and btw I do want to see how things look using DuckDB

I do want to add a proper prefix trie (router-table style), put every autonomous system in there, with every range, and then

  • even more interesting lookup and range queries
  • bulk aggregate stats (density, per AS info, etc.)
  • generate detailed interesting Hilbert curves (and other visualisations maybe), with lotsa info
  • I'd then like to expose this db as a web service, for showing off benchmarking if nothing else; but since the effective cost is close to nil, a free, reliable service of this kind could prove useful to some
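
For reference, the "router-table style" prefix trie mentioned above could look roughly like the sketch below: a plain binary trie with one level per prefix bit and longest-prefix-match lookup. Nothing like this exists in the repo yet; all names are hypothetical, and a real version would likely use a compressed (path-compacted or multibit) trie rather than one node per bit.

```c
#include <stdint.h>
#include <stdlib.h>

/* One node per prefix bit; asn == 0 means "no prefix ends here". */
struct trie_node {
    struct trie_node *child[2];
    uint32_t asn;
};

struct trie_node *trie_new(void)
{
    return calloc(1, sizeof(struct trie_node));
}

/* Insert prefix/len (host byte order) tagged with an AS number.
 * Returns 0 on success, -1 on bad length or allocation failure. */
int trie_insert(struct trie_node *root, uint32_t prefix, unsigned len, uint32_t asn)
{
    if (len > 32)
        return -1;
    struct trie_node *n = root;
    for (unsigned i = 0; i < len; i++) {
        unsigned bit = (prefix >> (31 - i)) & 1u;   /* most significant bit first */
        if (!n->child[bit]) {
            n->child[bit] = trie_new();
            if (!n->child[bit])
                return -1;
        }
        n = n->child[bit];
    }
    n->asn = asn;
    return 0;
}

/* Longest-prefix match: walk the address bits and remember the last
 * node that carried an AS number. Returns 0 when nothing matches. */
uint32_t trie_lookup(const struct trie_node *root, uint32_t ip)
{
    uint32_t best = root->asn;
    const struct trie_node *n = root;
    for (unsigned i = 0; i < 32; i++) {
        n = n->child[(ip >> (31 - i)) & 1u];
        if (!n)
            break;
        if (n->asn)
            best = n->asn;
    }
    return best;
}

/* Free the whole trie recursively (depth is at most 32). */
void trie_free(struct trie_node *n)
{
    if (!n)
        return;
    trie_free(n->child[0]);
    trie_free(n->child[1]);
    free(n);
}
```

For example, after inserting 10.0.0.0/8 for one AS and 10.1.0.0/16 for another, a lookup of 10.1.2.3 returns the /16's AS and a lookup of 10.2.3.4 falls back to the /8's. Per-AS aggregate stats then amount to running every stored IP through trie_lookup and counting.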

More plans and more optional roadmap items / ideas, I will include them later. Right now I just want to get in the habit of not hiding all the repos, cleaning code sooner than later, having something at least semi functional structured (incl. directory/repo wise) in a reusable / decent way.

TODO mipdb name

Better name? This one is almost procedurally generated, multiple "quick how to name gcc output" iterations:

IPv4 DB (and DB builder from masscan search results), multithreaded version.

And that's what it is.

mipdb build

Just compile with a C99-compatible compiler (I need to lint my own code though...), as in

gcc mipdb.c -o mipdb

(This should work in most places, i.e. it should at least compile, if not run, on all Unix-like systems: GNU/Linux and the BSD family (incl. macOS); but let me know if I'm wrong. On Linux it'll likely be actual gcc; on my macOS, gcc resolves to clang, and that's fine, everything works.)

Normally, no flags and no libs / includes outside stdlib are required, but I've only built this on macOS (arm64) and maybe on amd64 Ubuntu (the latter maybe once).

Non-unix-like / Windows users

  • should compile it the same way they'd compile any C program with one or two source files
  • but as of now it uses POSIX stuff (e.g. from <unistd.h>, which mipdb.c #includes)
  • so approach it the same way you'd approach any C code which relies on POSIX
  • Cygwin and MinGW provide their own versions of unistd.h, for example
  • but I haven't tried building on Windows
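
As a generic illustration of handling a POSIX dependency (I'm not claiming this is what mipdb.c needs <unistd.h> for), the usual pattern is to guard the include and the call behind _WIN32. The helper below is hypothetical: it returns the number of online CPUs, the kind of value one might use as a default for -t.

```c
#ifdef _WIN32
#include <windows.h>
#else
#include <unistd.h>   /* POSIX-only header */
#endif

/* Hypothetical helper: number of online CPUs, or 1 if it can't be determined. */
long online_cpu_count(void)
{
#ifdef _WIN32
    SYSTEM_INFO info;
    GetSystemInfo(&info);
    return info.dwNumberOfProcessors > 0 ? (long)info.dwNumberOfProcessors : 1;
#else
    /* _SC_NPROCESSORS_ONLN is a widely available extension (glibc, macOS,
     * the BSDs), not something POSIX strictly requires. */
    long n = sysconf(_SC_NPROCESSORS_ONLN);
    return n > 0 ? n : 1;
#endif
}
```

The Windows branch is untested here, in the same spirit as the rest of this section.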

Report issues

Esp. if my assumptions above are wrong.

But also any build issues.

And also any issues in general. (Maybe not life-broad)

About

just a wee lil scanning thingy
