Commit Graph

1258 Commits

Author SHA1 Message Date
Tolga Ceylan
820baf36dc fn: clean api tests: removed multi log (#801)
fn-test-utils covers this, with sleep in between.
2018-02-27 21:03:03 -08:00
CI
b1827b5bfb fnserver: 0.3.347 release [skip ci] 2018-02-27 22:38:44 +00:00
Tolga Ceylan
222e06f5f5 fn: add fetch API error code helper function (#800) 2018-02-27 14:30:58 -08:00
CI
98df0aba94 fnserver: 0.3.346 release [skip ci] 2018-02-27 20:25:41 +00:00
Tolga Ceylan
320b766a6d fn: introduce agent config and minor ghostreader tweak (#797)
* fn: introduce agent config and minor ghostreader tweak

TODO: move all constants/tweaks in agent to agent config.

* fn: json convention
2018-02-27 12:17:13 -08:00
Tolga Ceylan
46fad7ef80 fn: plumb up I/O errors from docker wait (#798)
Reed Allman <rdallman10@gmail.com>'s I/O error fix.
2018-02-27 12:17:02 -08:00
CI
67c5c8bcaf fnserver: 0.3.345 release [skip ci] 2018-02-27 18:38:13 +00:00
Reed Allman
a56d204450 fix up response headers (#788)
* fix up response headers

* stops defaulting to application/json. this was something awful, go stdlib has
a func to detect content type. sadly, it doesn't contain json, but we can do a
pretty good job by checking for an opening '{'... there are other fish in the
sea, and now we handle them nicely instead of saying it's a json [when it's
not]. a test confirms this, there should be no breakage for any routes
returning a json blob that were relying on us defaulting to this format
(granted that they start with a '{').
* buffers output now to a buffer for all protocol types (default is no longer
left out in the cold). use a little response writer so that we can still let
users write headers from their functions. this is useful for content type
detection instead of having to do it in multiple places.
* plumbs the little content type bit into fn-test-util just so we can test it,
we don't want to put this in the fdk since it's redundant.

I am totally in favor of getting rid of content type from the top level json
blurb. it's redundant, at best, and can have confusing behaviors if a user
uses both the headers and the content_type field (we override with the latter,
now). it's client protocol specific to http to a certain degree, other
protocols may use this concept but have their own way to set it (like http
does in headers..). I realize that it mostly exists because it's somewhat gross
to have to index a list from the headers in certain languages more than
others, but with the ^ behavior, is it really worth it?

closes #782

* reset idle timeouts back

* move json prefix to stack / next to use
2018-02-27 10:30:33 -08:00
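
The content type detection described in #788 can be sketched roughly as below. This is only an illustration: `detectContentType` is a hypothetical helper name, and skipping leading whitespace before looking for the `{` is an assumption not stated in the commit message. `http.DetectContentType` is the stdlib function the message refers to.

```go
package main

import (
	"bytes"
	"fmt"
	"net/http"
)

// detectContentType is a hypothetical helper: it treats a body whose first
// non-whitespace byte is '{' as JSON and otherwise defers to the stdlib
// sniffer, which has no JSON signature of its own.
func detectContentType(body []byte) string {
	trimmed := bytes.TrimLeft(body, " \t\r\n") // assumption: ignore leading whitespace
	if len(trimmed) > 0 && trimmed[0] == '{' {
		return "application/json"
	}
	return http.DetectContentType(body)
}

func main() {
	samples := []string{`{"hello":"world"}`, "<html><body>hi</body></html>", "plain text"}
	for _, s := range samples {
		fmt.Printf("%q -> %s\n", s, detectContentType([]byte(s)))
	}
}
```

A JSON array or scalar would not be caught by the `{` check, which matches the caveat in the commit message about routes whose output starts with a '{'.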
CI
e981f7f3d7 fnserver: 0.3.344 release [skip ci] 2018-02-27 09:17:14 +00:00
Tolga Ceylan
8b65ae8f9a fn: add docker command info to retry when logging errors (#795) 2018-02-27 01:10:07 -08:00
Tolga Ceylan
95d64f3aa9 fn: minor test improvements (#794) 2018-02-26 16:10:40 -07:00
CI
5551d6318a fnserver: 0.3.343 release [skip ci] 2018-02-26 18:31:43 +00:00
CI
540cc71899 fnserver: 0.3.342 release [skip ci] 2018-02-20 23:51:12 +00:00
Travis Reeder
575e1d3d0c Removes "type" from json format. Was pointless. (#783) 2018-02-20 12:04:08 -08:00
CI
cebae57007 fnserver: 0.3.341 release [skip ci] 2018-02-20 18:46:43 +00:00
Reed Allman
c0df9496a7 reduce allocs in getSlotQueueKey (#778)
this somewhat minimally comes up in profiling, but it was an itch i needed to
scratch. this does 10x fewer allocations and is 3x faster (with about 4x fewer bytes),
and they're the small painful kind of allocation. we're only reading these
strings so the uses of unsafe are fine (I think; audit me). the byte array
we're casting to a string at the end is also heap allocated and does
escape. I only count 2 allocations, but there's 3 (`hash.Sum` and
`make([]string)`), using a pool of sha1 hash.Hash shaves 120 bytes and an alloc
off so seems worth it (it's minimal). if we set a max size of config vals with
a constant we could avoid that allocation and we could probably find a
checksum package that doesn't use the `hash.Hash` that would speed things up a
little (no dynamic dispatch, doesn't allocate in Sum) but there's not one I
know of in stdlib.

master:
```
✗: go test -run=yodawg -bench . -benchmem -benchtime 1s -cpuprofile cpu.out
goos: linux
goarch: amd64
pkg: github.com/fnproject/fn/api/agent
BenchmarkSlotKey          200000              6068 ns/op             696 B/op         31 allocs/op
PASS
ok      github.com/fnproject/fn/api/agent       1.454s
```

now:
```
✗: go test -run=yodawg -bench . -benchmem -benchtime 1s -cpuprofile cpu.out
goos: linux
goarch: amd64
pkg: github.com/fnproject/fn/api/agent
BenchmarkSlotKey         1000000              1901 ns/op             168 B/op          3 allocs/op
PASS
ok      github.com/fnproject/fn/api/agent       2.092s
```

once we have versioned apps/routes we don't need to build a sha or sort
configs so this will get a lot faster.

anyway, mostly funsies here... my life is that sad now.
2018-02-16 11:39:10 -08:00
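
The pooled-hasher idea from #778 might look something like the following. This is a minimal sketch, not the actual `getSlotQueueKey`: `slotKey`, `writeField`, the choice of fields, the zero-byte separator, and the hex encoding are all assumptions made for illustration, and it does not use `unsafe` as the commit does.

```go
package main

import (
	"crypto/sha1"
	"encoding/hex"
	"fmt"
	"hash"
	"io"
	"sort"
	"sync"
)

// shaPool hands out reusable SHA-1 hashers so that each key computation does
// not have to allocate a fresh hash.Hash.
var shaPool = sync.Pool{New: func() interface{} { return sha1.New() }}

// writeField feeds one string plus a zero-byte separator into the hash.
// Writes to a hash.Hash never return an error, so the results are ignored.
func writeField(w io.Writer, s string) {
	io.WriteString(w, s)
	w.Write([]byte{0})
}

// slotKey is a hypothetical stand-in for getSlotQueueKey: it hashes the app
// name, the route path, and the config in sorted key order, so the same
// inputs always produce the same key.
func slotKey(app, path string, config map[string]string) string {
	h := shaPool.Get().(hash.Hash)
	h.Reset()
	defer shaPool.Put(h)

	keys := make([]string, 0, len(config))
	for k := range config {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	writeField(h, app)
	writeField(h, path)
	for _, k := range keys {
		writeField(h, k)
		writeField(h, config[k])
	}

	// Sum appends to the slice it is given, so a fixed-size array provides
	// enough capacity and no new slice has to be grown.
	var buf [sha1.Size]byte
	return hex.EncodeToString(h.Sum(buf[:0]))
}

func main() {
	a := slotKey("myapp", "/hello", map[string]string{"A": "1", "B": "2"})
	b := slotKey("myapp", "/hello", map[string]string{"B": "2", "A": "1"})
	fmt.Println(a, a == b)
}
```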
CI
71ee0eb860 fnserver: 0.3.340 release [skip ci] 2018-02-16 04:21:36 +00:00
Tolga Ceylan
af1ea0fa95 fn: ui no longer uses /stats (#776)
Decommission /stats related code.
2018-02-15 16:05:59 -08:00
CI
d80d4432dd fnserver: 0.3.339 release [skip ci] 2018-02-15 01:05:20 +00:00
Reed Allman
04ae223a5d fixup json,http protocols (#772)
* http now buffers the entire request body from the container before copying
it to the response writer (and sets content length). this is a level of sad i
don't feel comfortable talking about but it is what it is.
* json protocol was buffering the entire body so there wasn't any reason for
us to try to write this directly to the container stdin manually. we needed to
add a bufio.Writer around it anyway, since it was making too many write(fd)
syscalls the way it was. this is just easier overall and has the same performance
as http now in my tests, whereas previously this was 50% slower [than http].
* add buffer pool for http & json to share/use. json doesn't create a new
buffer every stinkin request. we need to plumb down content length so that we
can properly size the buffer for json, have to add header size and everything
together but it's probably faster than malloc(); punting on properly sizing.
* json now sets content length to the length of the body from the returned json
blurb from the container

this does not handle imposing a maximum size of the response returned from a
container, which we need to add, but this has been open for some time
(specifically, on json). we can impose this by wrapping the pipes, but there's
some discussion to be had for json specifically, since we won't be able to just
cut off the output stream and use that (for http we can do this). anyway, filing a
ticket...

closes #326 :(((((((
2018-02-14 14:06:36 -08:00
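
A shared buffer pool along the lines described in #772 could be sketched as follows. The names `bufPool`, `writeBuffered`, and `respond` are hypothetical, and the sketch uses `httptest.ResponseRecorder` in place of a real client connection. It buffers the whole container output, sets the content length, and only then writes the response.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strconv"
	"strings"
	"sync"
)

// bufPool lets requests reuse buffers instead of allocating a new one each time.
var bufPool = sync.Pool{New: func() interface{} { return new(bytes.Buffer) }}

// writeBuffered reads all of containerOut into a pooled buffer, sets
// Content-Length from the buffered size, then copies the body to w.
func writeBuffered(w http.ResponseWriter, containerOut io.Reader) error {
	buf := bufPool.Get().(*bytes.Buffer)
	buf.Reset()
	defer bufPool.Put(buf)

	if _, err := buf.ReadFrom(containerOut); err != nil {
		return err
	}
	w.Header().Set("Content-Length", strconv.Itoa(buf.Len()))
	_, err := w.Write(buf.Bytes())
	return err
}

// respond runs writeBuffered against an in-memory recorder and reports the
// Content-Length header and the body that were written.
func respond(body string) (contentLength, written string, err error) {
	rec := httptest.NewRecorder()
	err = writeBuffered(rec, strings.NewReader(body))
	return rec.Header().Get("Content-Length"), rec.Body.String(), err
}

func main() {
	cl, out, err := respond("hello from a function")
	fmt.Println(cl, out, err)
}
```

As the commit message notes, nothing here bounds the size of the response. A real implementation would need a limit on how much is read into the buffer.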
Reed Allman
9cbe4ea536 add pprof endpoints, additional spans (#770)
i would split this commit in two if i were a good dev.

the pprof stuff is really useful and this only samples when called. this is
pretty standard go service stuff. expvar is cool, too.

the additional spannos have turned up some interesting tidbits... gonna slide
em in
2018-02-13 20:01:41 -08:00
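
For reference, wiring up the standard pprof handlers in a Go service usually looks something like the sketch below. It assumes a plain `net/http` mux rather than whatever router fn actually uses. `newDebugMux` and `statusFor` are hypothetical names, and the paths are the conventional `/debug/pprof/` ones.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"net/http/pprof"
)

// newDebugMux registers the stdlib pprof handlers on a dedicated mux.
// Profiles are only collected when one of these endpoints is requested.
func newDebugMux() *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/debug/pprof/", pprof.Index)
	mux.HandleFunc("/debug/pprof/cmdline", pprof.Cmdline)
	mux.HandleFunc("/debug/pprof/profile", pprof.Profile)
	mux.HandleFunc("/debug/pprof/symbol", pprof.Symbol)
	mux.HandleFunc("/debug/pprof/trace", pprof.Trace)
	return mux
}

// statusFor issues an in-memory GET against the handler and returns the
// HTTP status code, which is enough to confirm an endpoint is wired up.
func statusFor(h http.Handler, path string) int {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Code
}

func main() {
	mux := newDebugMux()
	for _, p := range []string{"/debug/pprof/", "/debug/pprof/cmdline"} {
		fmt.Println(p, statusFor(mux, p))
	}
}
```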
CI
61f4fe2e24 fnserver: 0.3.338 release [skip ci] 2018-02-14 03:54:25 +00:00
Tolga Ceylan
c132cf1825 fn: dind SIGINT and SIGCHLD changes (#771)
1) in dind, prevent SIGINT from reaching dockerd. Otherwise it kills
docker and prevents a clean shutdown while fn server is trying to stop.
2) as init process, always reap child processes.
2018-02-13 19:46:53 -08:00
CI
f01b502bc7 fnserver: 0.3.337 release [skip ci] 2018-02-14 03:06:43 +00:00
Reed Allman
1a1250e5ea disable fail whale logs (#768)
we have been getting these from attach all this time and never needed these
anyway.

I ran cpu profiles of dockerd and this was 90% of docker cpu usage (json
logs). woot. this will reduce i/o quite a bit, and we don't have to worry
about them taking up any disk space either.

from tests i get about 50% speedup with these off. the hunt continues...
2018-02-13 17:45:11 -08:00
CI
eebb9ae4e7 fnserver: 0.3.336 release [skip ci] 2018-02-13 19:35:03 +00:00
Reed Allman
f287ad274e support deeper / nesting of image names (#765)
closes #764
2018-02-13 11:26:28 -08:00
CI
46caee5815 fnserver: 0.3.335 release [skip ci] 2018-02-13 02:53:29 +00:00
Reed Allman
cbfd659e7e cap docker retries to fixed number (#762)
previously we would retry infinitely up to the context with some backoff in
between. for hot functions, since we don't set any deadline on pulling or
creating the image, this means it would retry forever without making any
progress if e.g. the registry is inaccessible or we hit any other temporary
error that isn't actually temporary. this adds a hard cap of 10 retries, which
gives approximately 13s if the ops take no time, still respecting the context
deadline enclosed.

the case where this was coming up is now tested for and was otherwise
confusing for users to debug, now it spits out an ECONNREFUSED with the
address of the registry, which should help users debug without having to poke
around fn logs (though I don't like this as an excuse, not all users will be
operators at some point in the near future, and this one makes sense)

closes #727
2018-02-12 18:45:30 -08:00
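
A capped retry loop that still honors the context, in the spirit of #762, might be sketched like this. The names `retry`, `demo`, and `maxAttempts` are hypothetical, and the linear millisecond backoff is only a placeholder (the commit's actual schedule, which adds up to roughly 13s, is not shown in the message). The sketch also treats the cap as a total number of attempts, which is an assumption.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// maxAttempts mirrors the hard cap of 10 mentioned in the commit message.
const maxAttempts = 10

// retry calls op until it succeeds, the attempt cap is reached, or ctx is
// done, sleeping between attempts with a linearly growing backoff.
func retry(ctx context.Context, attempts int, base time.Duration, op func() error) error {
	var err error
	for i := 0; i < attempts; i++ {
		if ctxErr := ctx.Err(); ctxErr != nil {
			return ctxErr // context cancelled or past its deadline
		}
		if err = op(); err == nil {
			return nil
		}
		if i == attempts-1 {
			break // no point sleeping after the final attempt
		}
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-time.After(base * time.Duration(i+1)):
		}
	}
	return fmt.Errorf("giving up after %d attempts: %v", attempts, err)
}

// demo runs an operation that always fails and reports how many times it was
// called; with cancelled set, the context is cancelled before the first try.
func demo(cancelled bool) (int, error) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	if cancelled {
		cancel()
	}
	calls := 0
	err := retry(ctx, maxAttempts, time.Millisecond, func() error {
		calls++
		return errors.New("connection refused")
	})
	return calls, err
}

func main() {
	fmt.Println(demo(false))
	fmt.Println(demo(true))
}
```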
CI
726f615a03 fnserver: 0.3.334 release [skip ci] 2018-02-13 01:59:10 +00:00
Reed Allman
97194b3d8b return bad function http resp error (#728)
* return bad function http resp error

this was being thrown into the fn server logs but it's relatively easy to get
this to crop up if a function user forgets that they left a `println` laying
around that gets written to stdout, it garbles the http (or json, in its case)
output and they just see 'internal server error'. for certain clients i could
see that we really do want to keep this as 'internal server error' but for
things like e.g. docker image not authorized we're showing that in the
response, so this seems apt.

json likely needs the same treatment, will file a bug.

as always, my error messages are rarely helpful enough, help me please :)

closes #355

* add formatting directive

* fix up http error

* output bad jasons to user

closes #729

woo
2018-02-12 17:51:45 -08:00
CI
9d3b66d807 fnserver: 0.3.333 release [skip ci] 2018-02-13 00:00:27 +00:00
Tolga Ceylan
567136cb5e fn: required docker version fix (#759) 2018-02-12 15:53:05 -08:00
CI
ab77223d05 fnserver: 0.3.332 release [skip ci] 2018-02-12 22:18:51 +00:00
Tolga Ceylan
c848fc6181 fn: hot container timer improvements (#751)
* fn: hot container timer improvements

With this change, now we are allocating the timers
when the container starts and managing them via
stop/clear as needed, which should not only be more
efficient, but also easier to follow.

For example, previously, if eject timeout was
set to 10 secs, this could have delayed idle timeout
up to 10 secs as well. It is also not necessary to do
any math for elapsed time.

Now consumers avoid any requeuing when startDequeuer() is cancelled.
This was triggering additional dequeue/requeue causing
containers to wake up spuriously. Also in startDequeuer(),
we no longer remove the item from the actual queue and
leave this to acquire/eject, which sidesteps issues related
to an item landing in the channel but not being consumed, etc.
2018-02-12 14:12:03 -08:00
CI
ffcda9b823 fnserver: 0.3.331 release [skip ci] 2018-02-12 18:42:21 +00:00
Tolga Ceylan
b2c95410f4 fn: test case additions (#755)
1) oom test
2) invalid http resp code test
3) check for error string contents in various error cases
2018-02-12 10:34:35 -08:00
CI
a2aad73664 fnserver: 0.3.330 release [skip ci] 2018-02-12 18:27:15 +00:00
CI
6f3237585d fnserver: 0.3.329 release [skip ci] 2018-02-09 21:30:52 +00:00
Reed Allman
27179ddf54 plumb ctx for container removal spanno (#750)
these were just dangling off on the side, took some plumbing work but not so
bad
2018-02-08 22:48:23 -08:00
CI
aea3bab95e fnserver: 0.3.328 release [skip ci] 2018-02-09 01:23:22 +00:00
Reed Allman
3ab49d4701 limit log size in containers (#748)
closes #317

we could fiddle with this, but we need to at least bound these. this
accomplishes that. 1m is picked since that's our default max log size for the
time being per call, it also takes a little time to generate that many bytes
through logs, typically (i.e. without trying to). I tested with 0, which
spiked the i/o rate on my machine because it's constantly deleting the json
log file. I also tested with 1k and it was similar (for a task that generated
about 1k in logs quickly) -- in testing, this halved my throughput, whereas
using 1m did not change the throughput at all. trying the 'none' driver and
'syslog' driver weren't great, 'none' turns off all stderr and 'syslog' blocks
every log line (boo). anyway, this option seems to have no effect on the
output we get in 'attach', which is what we really care about (i.e. docker is
not logically capping this, just swapping out the log file).

using 1m for this, e.g. if we have 500 hot containers on a machine we have
potentially half a gig of worthless logs laying around. we don't need the
docker logs laying around at all really, but short of writing a storage driver
ourselves there don't seem to be too many better options. open to idears, but
this is likely to hold us over for some time.
2018-02-08 17:16:26 -08:00
CI
6c62bdb18a fnserver: 0.3.327 release [skip ci] 2018-02-08 01:29:10 +00:00
Tolga Ceylan
f27d47f2dd Idle Hot Container Freeze/Preempt Support (#733)
* fn: freeze/unfreeze and eject idle under resource contention
2018-02-07 17:21:53 -08:00
CI
105947d031 fnserver: 0.3.326 release [skip ci] 2018-02-08 00:56:39 +00:00
Tolga Ceylan
dc4d90432b fn: memory limit adjustments (#746)
1) limit kernel memory which was previously unlimited, using
   same limits as user memory for a unified approach.
2) disable swap memory for containers
2018-02-07 16:48:52 -08:00
CI
8f70b622cc fnserver: 0.3.325 release [skip ci] 2018-02-07 00:24:19 +00:00
Tolga Ceylan
ebc6657071 fn: docker version check2 (#744)
1) now required docker version is 17.06
2) enable circle ci latest docker install
3) docker driver & agent check minimum version before start
2018-02-06 16:16:40 -08:00
CI
640a47fe55 fnserver: 0.3.324 release [skip ci] 2018-02-06 00:23:02 +00:00
CI
4d802acc83 fnserver: 0.3.323 release [skip ci] 2018-02-05 20:00:05 +00:00