This is prep-work for more tests to come.
*) remove http response -1, this will break in go 1.10
*) add docker id & hostname to fn-test-utils (will be useful
to check/test which instance a request landed on.)
*) add container start/stop logs in fn-test-utils. To detect
if/how we miss logs during container start & end.
while escape analysis didn't lie that the bytes underlying this string escaped
to the heap, the reference to them died and led to us getting an undefined
byte array underlying the string.
sadly, this makes 4 allocs here (still down from 31), but only adds 100ns per
op. I still don't get why 'buf' and 'byts' escape to the heap, blaming faulty
escape analysis code.
this one is kind of impossible to write a test for.
found this from doing benchmarking stuff and was getting weird behavior at the
end of runs where calls didn't find a slot, ran bisect on a known-good commit
from a couple weeks ago and found that it was this. voila. this could explain
the variance from the slack dude's benchmarks, too. anyway, confirmed that
this fixes the issue.
* push down app listeners to a datastore
fnext.NewDatastore returns a datastore that wraps the appropriate methods for
AppListener in a Datastore implementation. this is more future proof than
needing to wrap every call of GetApp/UpdateApp/etc with the listeners, there
are a few places where this can happen and it seems like the AppListener
behavior is supposed to wrap the datastore, not just the front end methods
surrounding CRUD ops on an app. the hairy case that came up was when fiddling
with the create/update route business.
this changes the FireBeforeApp* ops to be an AppListener implementation itself
rather than having the Server itself expose certain methods to fire off the
app listeners, now they're on the datastore itself, which the server can
return the instance of.
small change to BeforeAppDelete/AfterAppDelete -- we were passing in a half
baked struct with only the name filled in and not filling in the fields
anywhere. this is mostly just misleading, we could fill in the app, but we
weren't and don't really want to, it's more to notify of an app deletion event
so that an extension can behave accordingly instead of letting a user inspect
the app. i know of 3 extensions and the changes required to update are very
small.
cleans up all the front end implementations FireBefore/FireAfter.
this seems potentially less flexible than previous version if we do want to
allow users some way to call the database methods without using the
extensions, but that's exactly the trade off, as far as the AppListener's are
described it seems heavily implied that this should be the case.
mostly a feeler, for the above reasons, but this was kind of odorous so just
went for it. we do need to lock in the extension api stuff.
* hand em an app that's been smokin the reefer
* fix up response headers
* stops defaulting to application/json. this was something awful, go stdlib has
a func to detect content type. sadly, it doesn't contain json, but we can do a
pretty good job by checking for an opening '{'... there are other fish in the
sea, and now we handle them nicely instead of saying it's a json [when it's
not]. a test confirms this, there should be no breakage for any routes
returning a json blob that were relying on us defaulting to this format
(granted that they start with a '{').
* buffers output now to a buffer for all protocol types (default is no longer
left out in the cold). use a little response writer so that we can still let
users write headers from their functions. this is useful for content type
detection instead of having to do it in multiple places.
* plumbs the little content type bit into fn-test-util just so we can test it,
we don't want to put this in the fdk since it's redundant.
I am totally in favor of getting rid of content type from the top level json
blurb. it's redundant, at best, and can have confusing behaviors if a user
uses both the headers and the content_type field (we override with the latter,
now). it's client protocol specific to http to a certain degree, other
protocols may use this concept but have their own way to set it (like http
does in headers..). I realize that it mostly exists because it's somewhat gross
to have to index a list from the headers in certain languages more than
others, but with the ^ behavior, is it really worth it?
closes#782
* reset idle timeouts back
* move json prefix to stack / next to use
a lot of the logging tests use async calls which was flaky with timings. there
are a couple left that use async, that we need to test, but this reduces the
probability of the flaky tests flaking to 2/5 instead of 5/5.
the behaviors we are testing here we should not be testing through the api,
for this exact reason, we can test a ton of this in the agent itself and not
have to rely on timings. not taking that here, but this fixes circle ci
flaking out for a few days.
this somewhat minimally comes up in profiling, but it was an itch i needed to
scratch. this does 10x less allocations and is 3x faster (with 3x less bytes),
and they're the small painful kind of allocation. we're only reading these
strings so the uses of unsafe are fine (I think audit me). the byte array
we're casting to a string at the end is also heap allocated and does
escape. I only count 2 allocations, but there's 3 (`hash.Sum` and
`make([]string)`), using a pool of sha1 hash.Hash shaves 120 byte and an alloc
off so seems worth it (it's minimal). if we set a max size of config vals with
a constant we could avoid that allocation and we could probably find a
checksum package that doesn't use the `hash.Hash` that would speed things up a
little (no dynamic dispatch, doesn't allocate in Sum) but there's not one I
know of in stdlib.
master:
```
✗: go test -run=yodawg -bench . -benchmem -benchtime 1s -cpuprofile cpu.out
goos: linux
goarch: amd64
pkg: github.com/fnproject/fn/api/agent
BenchmarkSlotKey 200000 6068 ns/op 696 B/op 31 allocs/op
PASS
ok github.com/fnproject/fn/api/agent 1.454s
```
now:
```
✗: go test -run=yodawg -bench . -benchmem -benchtime 1s -cpuprofile cpu.out
goos: linux
goarch: amd64
pkg: github.com/fnproject/fn/api/agent
BenchmarkSlotKey 1000000 1901 ns/op 168 B/op 3 allocs/op
PASS
ok github.com/fnproject/fn/api/agent 2.092s
```
once we have versioned apps/routes we don't need to build a sha or sort
configs so this will get a lot faster.
anyway, mostly funsies here... my life is that sad now.
* http now buffers the entire request body from the container before copying
it to the response writer (and sets content length). this is a level of sad i
don't feel comfortable talking about but it is what it is.
* json protocol was buffering the entire body so there wasn't any reason for
us to try to write this directly to the container stdin manually, we needed to
add a bufio.Writer around it anyway it was making too many write(fd) syscalls
with the way it was. this is just easier overall and has the same performance
as http now in my tests, whereas previously this was 50% slower [than http].
* add buffer pool for http & json to share/use. json doesn't create a new
buffer every stinkin request. we need to plumb down content length so that we
can properly size the buffer for json, have to add header size and everything
together but it's probably faster than malloc(); punting on properly sizing.
* json now sets content type to the length of the body from the returned json
blurb from the container
this does not handle imposing a maximum size of the response returned from a
container, which we need to add, but this has been open for some time
(specifically, on json). we can impose this by wrapping the pipes, but there's
some discussion to be had for json specifically we won't be able to just cut
off the output stream and use that (http we can do this). anyway, filing a
ticket...
closes#326 :(((((((
i would split this commit in two if i were a good dev.
the pprof stuff is really useful and this only samples when called. this is
pretty standard go service stuff. expvar is cool, too.
the additional spannos have turned up some interesting tid bits... gonna slide
em in
1) in dind, prevent SIGINT reaching to dockerd. This kills
docker and prevents shutdown as fn server is trying to stop.
2) as init process, always reap child processes.
we have been getting these from attach all this time and never needed these
anyway.
I ran cpu profiles of dockerd and this was 90% of docker cpu usage (json
logs). woot. this will reduce i/o quite a bit, and we don't have to worry
about them taking up any disk space either.
from tests i get about 50% speedup with these off. the hunt continues...
This PR changes the instructions for running Prometheus and Grafana in Docker to use the `--add-host` parameter instead of the `--link` parameter which is deprecated and which has been reported by @shaunsmith to sometimes not work.
The supplied command has been verified by @shaunsmith on Mac and Linux and by me on Linux.
previously we would retry infinitely up to the context with some backoff in
between. for hot functions, since we don't set any dead line on pulling or
creating the image, this means it would retry forever without making any
progress if e.g. the registry is inaccessable or any other temporary error
that isn't actually temporary. this adds a hard cap of 10 retries, which
gives approximately 13s if the ops take no time, still respecting the context
deadline enclosed.
the case where this was coming up is now tested for and was otherwise
confusing for users to debug, now it spits out an ECONNREFUSED with the
address of the registry, which should help users debug without having to poke
around fn logs (though I don't like this as an excuse, not all users will be
operators at some point in the near future, and this one makes sense)
closes#727
* return bad function http resp error
this was being thrown into the fn server logs but it's relatively easy to get
this to crop up if a function user forgets that they left a `println` laying
around that gets written to stdout, it garbles the http (or json, in its case)
output and they just see 'internal server error'. for certain clients i could
see that we really do want to keep this as 'internal server error' but for
things like e.g. docker image not authorized we're showing that in the
response, so this seems apt.
json likely needs the same treatment, will file a bug.
as always, my error messages are rarely helpful enough, help me please :)
closes#355
* add formatting directive
* fix up http error
* output bad jasons to user
closes#729
woo