Commit Graph

10 Commits

Author SHA1 Message Date
Reed Allman
c0df9496a7 reduce allocs in getSlotQueueKey (#778)
this somewhat minimally comes up in profiling, but it was an itch i needed to
scratch. this does 10x less allocations and is 3x faster (with 3x less bytes),
and they're the small painful kind of allocation. we're only reading these
strings so the uses of unsafe are fine (I think audit me).  the byte array
we're casting to a string at the end is also heap allocated and does
escape. I only count 2 allocations, but there's 3 (`hash.Sum` and
`make([]string)`), using a pool of sha1 hash.Hash shaves 120 byte and an alloc
off so seems worth it (it's minimal). if we set a max size of config vals with
a constant we could avoid that allocation and we could probably find a
checksum package that doesn't use the `hash.Hash` that would speed things up a
little (no dynamic dispatch, doesn't allocate in Sum) but there's not one I
know of in stdlib.

master:
```
✗: go test -run=yodawg -bench . -benchmem -benchtime 1s -cpuprofile cpu.out
goos: linux
goarch: amd64
pkg: github.com/fnproject/fn/api/agent
BenchmarkSlotKey          200000              6068 ns/op             696 B/op         31 allocs/op
PASS
ok      github.com/fnproject/fn/api/agent       1.454s
```

now:
```
✗: go test -run=yodawg -bench . -benchmem -benchtime 1s -cpuprofile cpu.out
goos: linux
goarch: amd64
pkg: github.com/fnproject/fn/api/agent
BenchmarkSlotKey         1000000              1901 ns/op             168 B/op          3 allocs/op
PASS
ok      github.com/fnproject/fn/api/agent       2.092s
```

once we have versioned apps/routes we don't need to build a sha or sort
configs so this will get a lot faster.

anyway, mostly funsies here... my life is that sad now.
2018-02-16 11:39:10 -08:00
Tolga Ceylan
c848fc6181 fn: hot container timer improvements (#751)
* fn: hot container timer improvements

With this change, now we are allocating the timers
when the container starts and managing them via
stop/clear as needed, which should not only be more
efficient, but also easier to follow.

For example, previously, if eject time out was
set to 10 secs, this could have delayed idle timeout
up to 10 secs as well. It is also not necessary to do
any math for elapsed time.

Now consumers avoid any requeuing when startDequeuer() is cancelled.
This was triggering additional dequeue/requeue causing
containers to wake up spuriously. Also in startDequeuer(),
we no longer remove the item from the actual queue and
leave this to acquire/eject, which side steps issues related
with item landing in the channel, not consumed, etc.
2018-02-12 14:12:03 -08:00
Reed Allman
27179ddf54 plumb ctx for container removal spanno (#750)
these were just dangling off on the side, took some plumbing work but not so
bad
2018-02-08 22:48:23 -08:00
Tolga Ceylan
97d78c584b fn: better slot/container/request state tracking (#719)
* fn: better slot/container/request state tracking
2018-01-26 12:21:11 -08:00
Tolga Ceylan
8c31e47c01 fn: agent slot improvements (#704)
*) Stopped using latency previous/current stats, this
was not working as expected. Fresh starts usually have
these stats zero for a long time, and initial samples
are high due to downloads, caches, etc.

*) New state to track: containers that are idle. In other
words, containers that have an unused token in the slot
queue.

*) Removed latency counts since these are not used in
container start decision anymore. Simplifies logs.

*) isNewContainerNeeded() simplified to use idle count
to estimate effective waiters. Removed speculative
latency based logic and progress check comparison.
In agent, waitHot() delayed signalling compansates
for these changes. If the estimation may fail, but
this should correct itself in the next 200 msec
signal.
2018-01-19 12:35:52 -08:00
Tolga Ceylan
2f0de2b574 fn: resource and slot cancel and broadcast improvements (#696)
* fn: resource and slot cancel and broadcast improvements

*) Context argument does not wake up the waiters correctly upon
cancellation/timeout.
*) Avoid unnecessary broadcasts in slot and resource.

* fn: limit scope of context in resource/slot calls in agent
2018-01-18 13:43:56 -08:00
Tolga Ceylan
5a7778a656 fn: cancellations in WaitAsyncResource (#694)
* fn: cancellations in WaitAsyncResource

Added go context with cancel to wait async resource. Although
today, the only case for cancellation is shutdown, this cleans
up agent shutdown a little bit.

* fn: locked broadcast to avoid missed wake-ups

* fn: removed ctx arg to WaitAsyncResource and startDequeuer

This is confusing and unnecessary.
2018-01-17 16:08:54 -08:00
Tolga Ceylan
1c8029e4f1 fn: more tests for hot container launch logic (#678) 2018-01-11 16:00:37 -08:00
Tolga Ceylan
db159e595f fn: new container lauch adjustments (#677)
*) revert executor wait queue size comparison. This is too
   aggresive and with stall check below, now unnecessary.
*) new container logic now checks if stats are constant, if
   this is the case, then we assume the system is stalled (eg
   running functions that take long time), this means we need
   to make progress and spin up a new container.
2018-01-11 14:09:21 -08:00
Tolga Ceylan
14789aba41 Slot mgr fixes (#613)
*) during shutdown, errors should be 503
*) new inactivity time out for hot queue, we previously kept hot queues in memory forever.
*) each hot queue now has a hot launcher to monitor and launch hot containers
*) consumers now create a consumer channel with startDequeuer() that can be cancelled via context
*) consumers now ping (signal) hot launcher every 200 msecs until they get a slot
*) tests for slot queue & mgr
2018-01-04 11:34:43 -08:00