
Results 1 ... 66 found in asciilifeform for 'from:billymg crawler'

2022-06-20 billymg i'm going to rent a server as a stopgap until another dulap becomes available in asciilifeform's rack. i'm going to move the crawler (currently on ec2) and the logger (currently on an rk) to this box. additionally i'll spin up a trb node on it, wainot
2022-05-06 billymg might be something worth time series charting on the crawler www now that i think about it (trb distance behind prb)
2022-05-06 billymg whaack: i noticed on the crawler a day or so ago all the trb nodes stuck at like 133XXX (i wanna say 133411 or around there) when prb was well into the 134XXX range
2022-04-27 billymg yeah the crawler www is on ec2 now, the rk couldn't handle all the work done for the homepage
2022-04-16 billymg new site is up and all crawler related stuff has been temporarily moved to ec2. the logger is still on the rk in asciilifeform's rack, only now with more resources to itself
2022-04-04 billymg btw, still a WIP but i've got an updated version of the crawler www running here now:
2022-04-04 billymg whaack: for the longest time no other nodes were returning it as a peer (which is currently the only way the crawler can discover a new node). i noticed yesterday that it was finally found and queried to see who had reported it
2022-04-03 billymg looks like the crawler finally found whaack's new node
2022-03-25 billymg nothing appears to be wrong with the crawler either, its logs continue to print out "found new peer!" almost hourly (which it finds when it pings and gets a response from a previously unknown peer returned to it by the getaddrs call to an existing node)
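The discovery process described above (ask each known node for its peers, then probe any address not seen before) can be sketched as a breadth-first walk. This is only an illustration, not the crawler's actual code: `crawl` and `get_peers` are hypothetical names, and `get_peers` stands in for a real getaddr probe, returning `None` when a node does not respond.

```python
from collections import deque


def crawl(seed_nodes, get_peers):
    """Breadth-first discovery over peer lists.

    get_peers is a hypothetical stand-in for a getaddr probe: it returns an
    iterable of peer addresses, or None if the node did not respond.
    Returns the set of nodes that responded.
    """
    seen = set(seed_nodes)      # every address ever queued for probing
    responsive = set()          # addresses that actually answered
    queue = deque(seed_nodes)
    while queue:
        node = queue.popleft()
        peers = get_peers(node)
        if peers is None:
            continue            # no response: nothing to learn from it
        responsive.add(node)
        for peer in peers:
            if peer not in seen:
                seen.add(peer)  # previously unknown peer, probe it later
                queue.append(peer)
    return responsive


# Toy network: "d" is up, but no other node lists it as a peer, so a crawl
# seeded from "a" never reaches it. "x" is listed by "c" but does not respond.
toy_network = {"a": ["b", "c"], "b": ["a"], "c": ["b", "x"], "d": ["a"]}
found = crawl(["a"], toy_network.get)
```

In the toy network, `found` ends up containing only "a", "b" and "c". That mirrors the situation in the log: a node that no probed peer returns stays invisible to this kind of crawl.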
2022-03-25 billymg ah, this reminded me. whaack, i queried the crawler's data to answer your question:
2022-03-22 billymg not the www portion, just the actual crawler part
2022-03-22 billymg yeah, don't think i'll need anything fancy. i've started reading SICP and i think when i get far enough along in that i want to re-write my crawler in lisp
2022-03-11 billymg whaack: in the case of those two you spot checked just now, the second indeed hasn't returned peers in a while (possibly ever, the crawler doesn't track this)
2022-03-11 billymg whaack: asciilifeform's watchglass intentionally does not include the relay byte in order to be compatible with trb. my crawler tries both with/without that byte in order to coax peers out of the node being probed
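A rough sketch of what "with/without that byte" means at the wire level, assuming the standard Bitcoin P2P `version` payload layout, where the relay flag is an optional trailing byte. The function names, user agent and defaults here are made up for illustration and are not taken from watchglass or the crawler.

```python
import struct
import time


def net_addr(ip4, port, services=0):
    """26-byte address field of a version message (no timestamp prefix)."""
    # IPv4 addresses are carried as IPv4-mapped IPv6: 10 zero bytes, ff ff, 4 octets.
    ip_bytes = bytes(10) + b"\xff\xff" + bytes(int(x) for x in ip4.split("."))
    # services is little-endian, the port is big-endian (network order).
    return struct.pack("<Q", services) + ip_bytes + struct.pack(">H", port)


def version_payload(protocol_version, peer_ip, peer_port=8333,
                    user_agent=b"/sketch:0.1/", start_height=0,
                    nonce=0, with_relay_byte=False, timestamp=None):
    """Build a version payload, optionally ending with the relay byte."""
    ts = int(time.time()) if timestamp is None else timestamp
    payload = struct.pack("<iQq", protocol_version, 0, ts)  # version, services, time
    payload += net_addr(peer_ip, peer_port)                 # addr_recv
    payload += net_addr("0.0.0.0", 0)                       # addr_from
    payload += struct.pack("<Q", nonce)
    # var_str with a one-byte length prefix (valid only for lengths < 0xfd)
    payload += bytes([len(user_agent)]) + user_agent
    payload += struct.pack("<i", start_height)
    if with_relay_byte:
        payload += b"\x01"                                  # optional relay flag
    return payload


without_relay = version_payload(70001, "127.0.0.1", timestamp=0)
with_relay = version_payload(70001, "127.0.0.1", timestamp=0, with_relay_byte=True)
```

The two payloads differ only in the final byte, so a prober can send one form, and if the node stays silent, retry with the other. The message header (magic, command, length, checksum) is left out of this sketch.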
2022-03-11 billymg which is currently the only way for the crawler to discover new nodes, until this feature is added
2022-03-11 billymg whaack: for some reason no other node scanned by the crawler has returned your new node as one of its peers
2022-03-09 billymg yeah dunno, your node is obviously connected to (and returning) plenty of peers. just that none of those peers has included your node in its list yet (at least not in what it returns to the crawler's getaddr requests)
2022-03-09 billymg i'm a bit curious why the crawler hasn't picked up your new node by now. all of those peers returned by watchglass are heathen nodes btw, i just looked them up in the db
2022-03-06 * billymg just went to press a new patch for the crawler and realized there's a typo in the root directory, will have to regrind the first two
2022-02-25 billymg finally getting back to working on the crawler, i've implemented geolocation (ty for the recommendation punkman) and time series data collection, for charting
2022-02-14 billymg whaack: makes sense. like i said my crawler was network i/o bound when single-threaded, adding threading allowed it to send out pings and process results from 100s of nodes simultaneously (whatever you set the max_sockets knob to in the crawler's config)
2022-02-14 billymg my crawler uses threading only because the only bottleneck there was network io (waiting for node responses), so a single python thread is more than enough
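A minimal sketch of the threading approach described in these two lines, assuming a thread pool whose size plays the role of the `max_sockets` knob. `probe_all` and `fake_probe` are hypothetical names; the fake probe just sleeps to imitate waiting on a node's response.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def probe_all(nodes, probe, max_sockets=100):
    """Probe every node with at most max_sockets probes in flight at once.

    Threads fit here because each probe spends nearly all of its time
    waiting on the network rather than using the CPU.
    """
    nodes = list(nodes)
    with ThreadPoolExecutor(max_workers=max_sockets) as pool:
        # pool.map yields results in the same order as the input nodes
        return dict(zip(nodes, pool.map(probe, nodes)))


def fake_probe(node):
    """Stand-in for a real network probe (hypothetical)."""
    time.sleep(0.01)  # imitate waiting for the node to answer
    return f"pong from {node}"


results = probe_all([f"10.0.0.{i}" for i in range(50)], fake_probe, max_sockets=25)
```

With 50 fake nodes and `max_sockets=25`, the probes run in roughly two batches instead of 50 sequential waits, which is the effect the log describes for a network-i/o-bound crawler.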
2022-02-14 billymg the logotron and crawler both run on flask atop apache so unfortunately i'm already familiar with it
2022-01-30 billymg prior to that it was getting some bogus queries through the crawler www, e.g. lookup info where host=drupal.php
2022-01-25 billymg asciilifeform: oh, heh, crawler lost its pg connection (doesn't have the auto-reconnect feature yet), probably what freed up the resources for the logger
2022-01-20 billymg no rush on my end, there are still features i'd like to add to the crawler, and some guides i'd like to publish
2022-01-13 billymg yeah, i was thinking of adding bot UI to crawler
2022-01-05 billymg yeah, it's on the same box as the crawler, so that could have something to do with it
2021-12-01 billymg asciilifeform: any word on when this will be ready? i'm working on some updates to the crawler's www and could use the extra horsepower
2021-09-17 billymg alright, i appreciate the info, will look more into SQLite and maybe do a test of it in the crawler. in the meantime might see about just adding some reconnect logic to these programs
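One possible shape for the "reconnect logic" mentioned here, written against a generic connection factory so it does not depend on any particular driver. `ReconnectingDB` is a hypothetical helper, not code from the crawler or logger. With psycopg2, the factory would presumably be something like `lambda: psycopg2.connect(dsn)` and the error type `psycopg2.OperationalError`, but that wiring is an assumption.

```python
import time


class ReconnectingDB:
    """Run database operations, reopening the connection when it drops."""

    def __init__(self, connect, retries=3, delay=1.0, errors=(Exception,)):
        self.connect = connect    # zero-argument factory returning a connection
        self.retries = retries    # extra attempts after the first failure
        self.delay = delay        # seconds to wait between attempts
        self.errors = errors      # exception types treated as a lost connection
        self.conn = None

    def run(self, operation):
        """Call operation(conn), reconnecting and retrying on failure."""
        for attempt in range(self.retries + 1):
            try:
                if self.conn is None:
                    self.conn = self.connect()
                return operation(self.conn)
            except self.errors:
                self.conn = None          # drop the dead connection
                if attempt == self.retries:
                    raise                 # out of attempts: surface the error
                time.sleep(self.delay)
```

A long-running program would route every query through `run`, so a dropped connection costs one retry instead of requiring a manual restart.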
2021-09-17 billymg my setup is fairly small/simple. only two programs writing (logger and crawler) and two reading (their respective wwws)
2021-09-17 billymg asciilifeform: so potentially the crawler is at times tying up postgres such that it times out for the logger?
2021-09-17 billymg hmm, actually possibly the crawler still has its connection now
2021-09-17 billymg the damn thing keeps losing its postgres connection (same thing happens to my crawler too, and they both stop working at the same time until restarted)
2021-09-08 billymg caught a bug in my crawler's genesis though, where two of the sql queries use a different index name than the one that gets defined when initializing from bitdash_schema.sql. i'll post a regrind of the genesis soon but if anyone runs into it the fix is to change the two instances of 'ON CONFLICT ON CONSTRAINT unique_host DO UPDATE SET' to 'ON CONFLICT ON CONSTRAINT
2021-09-08 billymg << this method works, was able to get the crawler running. i installed all the python libs i needed by specifying exact versions, e.g. `pip install -Iv psycopg2==2.8.6`, and at least with my small list of required deps (flask, psycopg2, and requests) all were available
2021-09-07 billymg asciilifeform: i plan to write up a complete guide for this build, including the source files and tarballs where not the default, once this is all done (so far have working mp-wp, just need the crawler and logotron now)
2021-09-04 billymg cgra: ^ essentially that list, though asciilifeform's watchglass has a configurable knob for 'peershots', my crawler has that set to 5, not sure what alf's watchglass is set to
2021-08-11 billymg << the 36 number is any TRB node the crawler has encountered since it started running on my server. the homepage now has a, perhaps more useful, 'trb nodes active in last 48hrs' number
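The difference between "ever encountered" and "active in the last 48 hours" comes down to filtering on a last-seen timestamp. A small sketch, assuming each node has a recorded last-response time; the node names and the `active_nodes` helper are hypothetical and do not reflect the crawler's schema.

```python
from datetime import datetime, timedelta


def active_nodes(last_seen, now, window=timedelta(hours=48)):
    """Return the hosts whose last response falls within the window."""
    return sorted(host for host, seen in last_seen.items() if now - seen <= window)


now = datetime(2021, 8, 11, 12, 0)
last_seen = {
    "node-a": now - timedelta(hours=3),
    "node-b": now - timedelta(hours=47),
    "node-c": now - timedelta(days=30),   # seen once, long gone
}
ever_seen = len(last_seen)
recently_active = active_nodes(last_seen, now)
```

Here three nodes have been encountered in total, but only "node-a" and "node-b" count as active.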
2021-07-21 billymg my crawler uses watchglass via an 'import watchglass' statement at the top of the file, but i'm only using a couple methods out of it
2021-07-20 billymg asciilifeform: yeah, i swear when i was running my crawler previously, a month or so ago, trb nodes always returned a reasonable number of nodes (double or low triple digit counts)
2021-07-20 billymg punkman: fake as in not even the heathen crawlers count them as real or ever having existed
2021-07-17 billymg just restarted the crawler with peershots=5, it finishes scanning all nodes in the network in about 20 minutes
2021-07-17 billymg signpost: interesting, the crawler results do seem to show that it's capped somewhere at about 2000 (i've never seen higher than 2001)
2021-07-17 billymg the crawler www is now browsable
2021-07-13 billymg the prb crawler has an api, maybe later at some point i could add in some automated cross referencing
2021-07-10 billymg << nice, looking at this node helped me identify a bug in my crawler
2021-07-08 billymg asciilifeform: the peer lists have been captured (the crawler now stores probe history up to N probes, as set in conf). i'll dump the results somewhere permanent before the cap is reached
2021-07-08 billymg asciilifeform: it's consistent across all trb nodes that my crawler has picked up, and all in the last 1-2 hours (first time i've observed it since running this thing)
2021-07-08 billymg asciilifeform: is your node? my crawler is showing that sometime in the last hour or so it jumped from around ~40 connected peers (mostly good) to ~1200 peers (mostly fake/spam)
2021-06-30 billymg i'm also looking at as a potential library for rendering charts/graphs on the crawler www, in case anyone has experience with either, or has other recommendations
2021-06-29 billymg asciilifeform: i'm also getting close to making the crawler site live, at least a basic version so that others can take a look and provide feedback. at that point i think i'll need to upgrade from my rk to a bigger rig, especially since i also want to run whaack's block explorer on there
2021-05-19 billymg trinque: fwiw my reason for working on the btc network crawler and new www to display the obvious centralization of the network is to attract more hands
2021-05-09 billymg asciilifeform: from there if you added some watchglass methods for getting blocks i could then incorporate those into the crawler (if that is what you meant by trying to analyze block propagation)
2021-05-09 billymg as soon as that's up will publish genesis for the crawler portion
2021-05-09 billymg my goal in making this crawler is to get more "bitcoiners" running trb nodes, and i suspect some al gore / nate silver stats and infograffix will widen the pool of those who see there is a problem
2021-05-09 billymg the ~8500 "actual" nodes number seems to be in line with what heathen trackers report as the total number of nodes on the network (~9k), so i suspect the crawler is nearly complete in mapping out reachable nodes
2021-05-09 billymg good morning, asciilifeform. my crawler seems to have hit a spam vein on the network, total unique IPs in the db exploded over the last two days to over 100k (these all come only from what a node returns in a 'getaddr' request). of those, when subsequently interrogated, only 8580 respond with a valid version message
2021-05-07 billymg i initially looked at it with the idea of repurposing for my crawler, barfed at 1001 dependencies pulled in, then remembered, "hey, watchglass does this"
2021-05-07 billymg the updated version of the crawler has been humming along nicely since last night, it's now up to ~4900 nodes discovered (heathen sites report over 9000)
2021-05-05 billymg << this wasn't even on my mp-wp todo list but since using postgres for the crawler it's now jumped near the top
2021-05-04 billymg from here i'm just going to proceed with making the crawler send the correct version message depending on whether it's trying to reach a trb or prb node. but yes, perhaps could write it so it tries once with '99999' then tries with '70001' and records result
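The "try once with '99999' then with '70001' and record the result" idea could look roughly like this. `probe_with_fallback` and `send_version` are hypothetical: `send_version(node, version)` stands in for whatever actually performs the handshake, returning the node's reply or `None` on no response.

```python
def probe_with_fallback(node, send_version, versions=(99999, 70001)):
    """Try each protocol version in order and record which one got a reply."""
    for version in versions:
        reply = send_version(node, version)
        if reply is not None:
            return {"node": node, "accepted_version": version, "reply": reply}
    # No version produced a response
    return {"node": node, "accepted_version": None, "reply": None}


# Fake handshake for illustration: this imaginary node only answers 70001.
def fake_send_version(node, version):
    return "verack" if version == 70001 else None


result = probe_with_fallback("10.0.0.1", fake_send_version)
```

The recorded `accepted_version` is what would let the crawler label a node after the fact, at the cost of one wasted attempt per node that rejects the first version.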
2021-05-03 billymg asciilifeform: ah, interesting. now i'm wondering what the crawler could do to coax a node into sending a 'heathen command' in a reasonable amount of time
2021-05-03 billymg asciilifeform: i think whaack was working on a replacement block explorer. this thing i'm building is much simpler, just a network crawler
2021-05-03 billymg my reason for doing so was because both bitnodes and stopped tracking trb with their crawlers. i used to be able to check from time to time to see how many trb nodes are out there
2021-05-03 billymg asciilifeform: i wrote a simple btc network crawler that uses watchglass for node probing and dumps the results into a postgres db. it's been running since yesterday afternoon, here are some stats so far: