* Docker stats to Prometheus
* Fix compilation error in docker_test
* Refactor docker driver Run function to wait for the container to have stopped before stopping the collection of statistics
* Fix go fmt errors
* Updates to sending docker stats to Prometheus
* remove new test TestWritResultImpl because the changes to support multiple waiters have been removed
* Update docker.Run to use channels, not contexts, to shut down the stats collector (a rough sketch follows this list)
* wip
* wip
* Added more fields to JSON and added blank line between objects.
* Update tests.
* wip
* Updated to represent recent discussions.
* Fixed up the json test
* More docs
* Changed from blank line to bracket, newline, open bracket.
* Blank line added back, easier for delimiting.
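A rough sketch of the channel-based shutdown described above — the dockerClient interface, recordStats, and runWithStats names here are stand-ins for illustration, not the driver's actual API:

```go
package drivers

import "time"

// dockerClient is a stand-in for the real docker client; this method set is
// an assumption for illustration only.
type dockerClient interface {
	Stats(id string) map[string]uint64
	WaitContainer(id string) error
}

func recordStats(s map[string]uint64) { /* e.g. update Prometheus gauges */ }

// runWithStats waits for the container to stop before shutting down the
// stats collector, signalling it with a channel rather than a context.
func runWithStats(c dockerClient, id string) error {
	stop := make(chan struct{})
	done := make(chan struct{})

	go func() {
		defer close(done)
		t := time.NewTicker(time.Second)
		defer t.Stop()
		for {
			select {
			case <-stop:
				return
			case <-t.C:
				recordStats(c.Stats(id))
			}
		}
	}()

	err := c.WaitContainer(id) // returns once the container has stopped
	close(stop)                // then stop collecting statistics
	<-done                     // and wait for the collector goroutine to exit
	return err
}
```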
our dear friend mr. funclogger was bypassing calls to our multi writer: since
we were embedding a *bytes.Buffer, callers were using ReadFrom and WriteString,
which would never call the stderr logger's Write method (or, as I learned,
anything else trying to wrap that buffer's Write method...).
the tl;dr is that DEBUG lines often don't get spat out, especially from async
tasks (few people are using this).
I think the final solution is probably to make funclogger a 'more robust'
interface that we understand, instead of trying to minimize it to an
io.ReadWriteCloser; much like how bytes.Buffer has all kinds of methods
implemented on it, we can most likely implement things like ReadFrom and
WriteString ourselves. not a big fan of how things are now (and it's my own
doing) with the ReadWriteCloser coming from multiple places, but meh, will
get to it some day soon; the log stuff will be a pretty hot path.
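To make the gotcha concrete (a self-contained toy, not the actual funclogger code): embed a *bytes.Buffer behind a Write override and the buffer's own WriteString / ReadFrom still win, so anything mirrored in Write never sees those paths.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"strings"
)

// logBuffer embeds *bytes.Buffer and overrides Write so that every write is
// also mirrored to a second writer (think: the stderr logger).
type logBuffer struct {
	*bytes.Buffer
	mirror io.Writer
}

func (b *logBuffer) Write(p []byte) (int, error) {
	b.mirror.Write(p) // mirror what we got
	return b.Buffer.Write(p)
}

func main() {
	var mirrored bytes.Buffer
	lb := &logBuffer{Buffer: new(bytes.Buffer), mirror: &mirrored}

	// goes through our Write override, so it is mirrored
	lb.Write([]byte("via Write\n"))

	// these resolve to the embedded bytes.Buffer's own methods and never
	// touch our override, so nothing is mirrored
	lb.WriteString("via WriteString\n")
	lb.ReadFrom(strings.NewReader("via ReadFrom\n"))

	fmt.Printf("buffer:\n%smirrored:\n%s", lb.String(), mirrored.String())
}
```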
this change makes Dispatch write the http headers and the request body
directly to the pipe, one by one, when the request body is non-empty;
if it is empty, it just writes the headers and finalizes (closes) the JSON.
What's new?
- better error handling
- still need to decode the JSON from the function because we need the status code and body
- prevent the request body from being a problem by deferring its close (see the sketch below)
- moving examples around: putting http and json samples into one folder
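A hedged sketch of that flow, assuming a plain io.Writer pipe and JSON-encoded headers — dispatch, the header map shape, and the field names are illustrative guesses, not the protocol's actual wire format:

```go
package protocol

import (
	"encoding/json"
	"io"
	"net/http"
)

// dispatch writes the headers first, then streams the body if there is one,
// and defers closing the body so it can't hold things open.
func dispatch(pipe io.Writer, req *http.Request) error {
	if req.Body != nil {
		defer req.Body.Close()
	}

	// headers go out first as their own JSON object
	head := map[string]interface{}{
		"method":  req.Method,
		"headers": req.Header,
	}
	if err := json.NewEncoder(pipe).Encode(head); err != nil {
		return err
	}

	// only stream the body through if there is something in it
	if req.Body != nil && req.ContentLength != 0 {
		_, err := io.Copy(pipe, req.Body)
		return err
	}
	return nil
}
```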
go vet caught some nifty bugs. so those are fixed here, and it also makes it
so that we vet everything from now on, since the robots seem to do a better
job of vetting than we have managed to.
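For flavor, the kind of thing vet is good at and humans skim past — a format verb that doesn't match its argument (an illustrative toy, not one of the actual bugs fixed here):

```go
package main

import "fmt"

func main() {
	attempts := 3
	// go vet: Sprintf format %s has arg attempts of wrong type int
	msg := fmt.Sprintf("retried %s times", attempts)
	fmt.Println(msg)
}
```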
also adds a gofmt check to circle. could move this to the test.sh script
(didn't want a script calling a script, because $reasons), and it's nice and
isolated in its own little land as it is. side note: changed the script so it
runs in 100ms instead of 3s, i think find is a lot faster than go list.
attempted some minor cleanup of various scripts
* idle_timeout max of 1h
* timeout max of 120s for sync, 1h for async
* max memory of 8GB
* do full route validation before call invocation
* ensure that idle_timeout >= timeout
we are now validating route updates inside of the database transaction, which
is what we should have been doing all along really. we need this behavior to
ensure that the idle timeout is longer than the timeout, among other benefits
(like actually updating the most recent version of the existing struct instead
of overwriting previous updates, yay). since we have this, we can get rid of
the weird skipZero behavior on validate too and validate the real deal
holyfield.
validating the route before making the call is handy so that we don't do weird
things like run a func that wants to use 300GB of RAM and run for 3 weeks.
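A minimal sketch of the limits and checks described above, assuming seconds/MB units and a flat Route struct — the constants and field names here are stand-ins, not the repo's actual models:

```go
package models

import "fmt"

// illustrative limits matching the list above (seconds and MB)
const (
	maxSyncTimeout  = 120      // 120s for sync
	maxAsyncTimeout = 3600     // 1h for async
	maxIdleTimeout  = 3600     // 1h
	maxMemory       = 8 * 1024 // 8GB in MB
)

type Route struct {
	Type        string // "sync" or "async"
	Timeout     int32  // seconds
	IdleTimeout int32  // seconds
	Memory      uint64 // MB
}

// Validate runs the full set of checks before a call is invoked (and inside
// the update transaction), so zero values are no longer skipped.
func (r *Route) Validate() error {
	maxTimeout := int32(maxSyncTimeout)
	if r.Type == "async" {
		maxTimeout = maxAsyncTimeout
	}
	switch {
	case r.Timeout <= 0 || r.Timeout > maxTimeout:
		return fmt.Errorf("timeout must be between 1 and %d seconds", maxTimeout)
	case r.IdleTimeout <= 0 || r.IdleTimeout > maxIdleTimeout:
		return fmt.Errorf("idle_timeout must be between 1 and %d seconds", maxIdleTimeout)
	case r.IdleTimeout < r.Timeout:
		return fmt.Errorf("idle_timeout must be >= timeout")
	case r.Memory == 0 || r.Memory > maxMemory:
		return fmt.Errorf("memory must be between 1 and %d MB", maxMemory)
	}
	return nil
}
```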
closes #192 closes #344 closes #162
I'd be pretty surprised if these were happening but meh, a computer running at
capacity can make the runtime scheduler do all kinds of weird shit, so this
locks down the behavior around slot launching.
I didn't load test much as there are cries of 'wolf' running amok, and it's
late, so this could be off a little -- but I think it's about this easy. cold
is the only one launching slots for itself, so it should always receive its
own slot (provided within time bounds). for hot we just need a way to tell the
ram token allocator that we aren't there anymore, so that somebody can close
the token (important).
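Roughly the mechanism for the hot path, as a toy (slotRequest, ramToken, and dispense are invented names, not the agent's real types): the waiter hands over a 'gone' channel, and whoever is holding the token when the waiter disappears closes it.

```go
package main

import (
	"fmt"
	"sync"
)

// ramToken is reserved memory; Close releases it exactly once.
type ramToken struct {
	mem  uint64
	once sync.Once
	free func(uint64)
}

func (t *ramToken) Close() { t.once.Do(func() { t.free(t.mem) }) }

// slotRequest is what a hot waiter gives the allocator: a channel to receive
// the token on, plus a 'gone' channel it closes if it stops waiting.
type slotRequest struct {
	resp chan *ramToken
	gone chan struct{}
}

// dispense hands the token to the waiter, or closes it if the waiter has
// already gone away, so the reserved RAM never leaks.
func dispense(req slotRequest, tok *ramToken) {
	select {
	case req.resp <- tok:
		// the waiter owns the token now and must Close it itself
	case <-req.gone:
		tok.Close()
	}
}

func main() {
	var freed uint64
	tok := &ramToken{mem: 128, free: func(m uint64) { freed += m }}

	req := slotRequest{resp: make(chan *ramToken), gone: make(chan struct{})}
	close(req.gone) // simulate a waiter that hit its deadline and left

	dispense(req, tok)
	fmt.Println("freed MB:", freed) // 128: the allocator closed it for us
}
```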
If the bug still persists, then either there is another bug around timing I'm
not aware of (possible, but unlikely), or, more likely, it's actually taking
up to the timeout to launch a container / find a ram slot / find a free
container. Otherwise, it's not related to the agent and the http server
timeouts may need fiddling with (read / write timeout). If the ruby client is
failing to connect, though, I'm guessing it's just that nobody is reading the
body (i.e. no function runs) and the error handling isn't very well done, as
we reply with a 504 if we hit a timeout (but if nobody is listening, they
won't get it).