61 Commits

Author SHA1 Message Date
Travis Reeder
d7bf64bf66 Big dependency update, all lowercase sirupsen's for all dependencies. 2017-08-23 19:52:56 -07:00
Reed Allman
6a7973e6b6 plumb all config fields into task
the mqs are storing a models.Task, which was not incorporating all the fields
that are in a task.Config. I would very much like to merge these two things,
but expect to do this in a future restructuring as both are used widely and
not cordoned off properly (Config has a channel, stdin, stdout, stderr -- and
isn't just a 'config', so to speak, as Task is).

Since a task.Config is what is used to actually run a container, the result of
the aforementioned deficiency was #193 where tasks are improperly configured
and run (namely, with the wrong memory).

async tasks still cannot be hot; they will be reverted to the default format.
would also like to fix this (also part of restructuring). I actually started
doing this, hence the changes to those files (the surface area of the change
is small and discourages improper future use, so I've left what I've done).

this will:

closes #193
closes #195
closes #154

removes many unused fields in models.Task, since we have not implemented
retries. priority & delay are left, even though they are not used either;
the main goal of this is to resolve #193 and both these fields are strongly
plumbed into all the mqs, so punting on those two.
2017-08-03 06:33:30 -07:00
Reed Allman
3ff28163db fix task memory
prior to this patch we were allowing 256MB for every function run, just
because that was the default for the docker driver and we were not using the
memory field on any given route configuration. this fixes that, now docker
containers will get the correct memory limit passed into the container from
the route. the default is still 128.

there is also an env var now, `MEMORY_MB` that is set on each function call,
see the linked issue below for rationale.

closes #186

ran the given function code from #186, and now i only see allocations up to
32MB before the function is killed. yay.

notes:

there is no max for memory. for open source fn i'm not sure we want to
cap it, really. in the services repo we probably should add a cap before prod.
since we don't know any given fn server's ram, we can't try to make sure the
setting on any given route is something that can even be run.

remove envconfig & bytefmt

this updates the glide.yaml file to remove the unused deps, but trying to
install fresh is broken atm so i couldn't remove from vendor/, going to fix
separately (next update we just won't get these). also changed the skip dir to
be the cli dir now that its name has changed (related to brokenness).

fix how ram slots were being allocated. integer division is significantly
slower than subtraction.
2017-08-02 19:09:16 -07:00
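
As a minimal sketch of how function code might consume the `MEMORY_MB` variable described in 3ff28163db: the helper name `memoryLimitMB` is hypothetical, and the assumption that the value is a plain integer number of megabytes is not confirmed by the commit message. The fallback of 128 mirrors the default stated there.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// defaultMemoryMB mirrors the default of 128 mentioned in the commit message.
const defaultMemoryMB = 128

// memoryLimitMB (hypothetical helper) parses a MEMORY_MB value, assuming it is
// a plain integer number of megabytes. It falls back to the default when the
// value is empty, malformed, or not positive.
func memoryLimitMB(raw string) int {
	mb, err := strconv.Atoi(raw)
	if err != nil || mb <= 0 {
		return defaultMemoryMB
	}
	return mb
}

func main() {
	limit := memoryLimitMB(os.Getenv("MEMORY_MB"))
	fmt.Printf("memory limit: %d MB\n", limit)
}
```

A function could use a value like this to size caches or buffers so that it stays under the limit instead of being killed.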
Denis Makogon
49fe3eb11a Fixing FMT errors
Do we run go-fmt in CI?
2017-07-31 21:14:11 +03:00
Travis Reeder
48e3781d5e Rename to GitHub (#3)
* circle

* Rename to github and fn->cli

*  Rename to github and fn->cli
2017-07-26 10:50:19 -07:00
Reed Allman
dc5e67b6d2 add opentracing spans for metrics 2017-07-25 08:55:22 -07:00
Travis Reeder
e56ac42bc2 Using ctx logger in more places to get more context in the logs - ie: call_id 2017-07-10 16:13:51 -07:00
Reed Allman
c20b4769bf make hot functions actually have logs now 2017-06-30 16:10:33 -07:00
Reed Allman
a0ec3024fd clean up the logging code
add limit writecloser, add closer method so we can flush logs properly,
buffer logs and stuff

it builds it works amirite
2017-06-11 17:39:08 -07:00
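
The "limit writecloser" mentioned in a0ec3024fd is not shown in this log. One way such a wrapper could look is sketched below. All names (`limitWriteCloser`, `errLimitReached`, `bufferCloser`) are illustrative assumptions, not the project's actual types.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"io"
)

// errLimitReached is a hypothetical sentinel error for this sketch.
var errLimitReached = errors.New("log size limit reached")

// limitWriteCloser passes at most a fixed number of bytes to the wrapped
// writer and forwards Close so the underlying log sink can be flushed.
type limitWriteCloser struct {
	w         io.WriteCloser
	remaining int64
}

func newLimitWriteCloser(w io.WriteCloser, max int64) *limitWriteCloser {
	return &limitWriteCloser{w: w, remaining: max}
}

// Write truncates p to the remaining budget and reports errLimitReached once
// the budget is exhausted.
func (l *limitWriteCloser) Write(p []byte) (int, error) {
	if l.remaining <= 0 {
		return 0, errLimitReached
	}
	truncated := false
	if int64(len(p)) > l.remaining {
		p = p[:l.remaining]
		truncated = true
	}
	n, err := l.w.Write(p)
	l.remaining -= int64(n)
	if err == nil && truncated {
		err = errLimitReached
	}
	return n, err
}

func (l *limitWriteCloser) Close() error { return l.w.Close() }

// bufferCloser adapts a bytes.Buffer to io.WriteCloser for the demo.
type bufferCloser struct {
	bytes.Buffer
	closed bool
}

func (b *bufferCloser) Close() error {
	b.closed = true
	return nil
}

func main() {
	buf := &bufferCloser{}
	lw := newLimitWriteCloser(buf, 5)
	n, err := lw.Write([]byte("hello world"))
	fmt.Println(n, err, buf.String())
	lw.Close()
}
```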
James
8a3edb8309 All of the changes for func logs 2017-06-19 11:38:11 -07:00
Reed Allman
75c5e83936 adds wait time based scaling across nodes
this works by having the functions server kick back a FXLB-WAIT header on
every request with the wait time for that function to start. the lb then keeps
track, on a per node+function basis, of an ewma of the last 10 requests' wait
times (to reduce jitter). now that we don't have max
concurrency it's actually pretty challenging to get the wait time stuff to
tick. i expect in the near future we will be throttling functions on a given
node in order to induce this, but that is for another day as that code needs a
lot of reworking. i tested this by introducing some arbitrary throttling (not
checked in) and load spreads over nodes correctly (see images). we will also
need to play with the intervals we want to use, as if you have a func with
50ms run time then basically 10 of those will rev up another node (this was
before removing max_c, with max_c=1) but in any event this wires in the basic
plumbing.

* make docs great again. renamed lb dir to fnlb
* added wait time to dashboard
* wires in a ready channel to await the first pull for hot images to count in
the wait time (should be otherwise useful)

future:
TODO rework lb code api to be pluggable + wire in data store
TODO toss out first data point containing pull to not jump onto another node
immediately (maybe this is actually a good thing?)
2017-06-09 16:30:34 -07:00
Reed Allman
9edacae928 clean up hotf(x) concurrency, rm max c
this patch gets rid of max concurrency for functions altogether, as discussed,
since it will be challenging to support across functions nodes. as a result of
doing so, the previous version of functions would fall over when offered 1000
functions, so there was some work needed in order to push this through.
further work is necessary as docker basically falls over when trying to start
enough containers at the same time, and with this patch essentially every
function can scale infinitely. it seems like we could add some kind of
adaptive restrictions based on task run length and configured wait time so
that fast running functions will line up to run in a hot container instead of
them all creating new hot containers.

this patch takes a first cut at whacking out some of the insanity that was the
previous concurrency model, which was problematic in that it limited
concurrency significantly across all functions since every task went through
the same unbuffered channel, which could create blocking issues for all
functions if the channel is not picked off fast enough (it's not apparent that
this was impossible in the previous implementation). in any event, each
request has a goroutine already, there's no reason not to use it. not too hard
to wrap a map in a lock, not sure what the benefits were (added insanity?). in effect,
this is marginally easier to understand and less insane (marginally). after
getting rid of max c this adds a blocking mechanism for the first invocation
of any function so that all other hot functions will wait on the first one to
finish to avoid a herd issue (was making docker die...) -- this could be
slightly improved, but works in a pinch. reduced some memory usage by removing
redundant maps of htfnsvr's and task.Requests (by a factor of 2!). cleaned up
some of the protocol stuff, need to clean this up further. anyway, it's a
first cut. have another patch that rewrites all of it but was getting into
rabbit hole territory, would be happy to oblige if anybody else has problems
understanding this rat's nest of channels. there is a good bit of work left to
make this prod ready (regardless of removing max c).

a warning that this will break the db schemas, didn't put the effort in to add
migration stuff since this isn't deployed anywhere in prod...

TODO need to clean out the htfnmgr bucket with LRU
TODO need to clean up runner interface
TODO need to unify the task running paths across protocols
TODO need to move the ram checking stuff into worker for noted reasons
TODO need better elasticity of hot f(x) containers
2017-06-05 20:04:13 -07:00
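
The "blocking mechanism for the first invocation" in 9edacae928 is described only in prose. The sketch below shows one way to make later invocations of a function wait for the first one, using a map wrapped in a lock as the message suggests. The names `firstRunGate` and `enter` are hypothetical and this is not the project's code.

```go
package main

import (
	"fmt"
	"sync"
)

// firstRunGate makes every invocation of a function other than the first
// wait until the first invocation has finished.
type firstRunGate struct {
	mu    sync.Mutex
	ready map[string]chan struct{}
}

func newFirstRunGate() *firstRunGate {
	return &firstRunGate{ready: make(map[string]chan struct{})}
}

// enter returns first=true to exactly one caller per function name; that
// caller must invoke done when it finishes. All other callers block until
// done has been called and then receive first=false with a no-op done.
func (g *firstRunGate) enter(fn string) (first bool, done func()) {
	g.mu.Lock()
	ch, ok := g.ready[fn]
	if !ok {
		ch = make(chan struct{})
		g.ready[fn] = ch
		g.mu.Unlock()
		var once sync.Once
		return true, func() { once.Do(func() { close(ch) }) }
	}
	g.mu.Unlock()
	<-ch // wait for the first invocation to finish
	return false, func() {}
}

func main() {
	g := newFirstRunGate()
	var wg sync.WaitGroup
	for i := 0; i < 3; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			first, done := g.enter("hello")
			defer done()
			fmt.Printf("invocation %d first=%v\n", id, first)
		}(i)
	}
	wg.Wait()
}
```

Once the first invocation has completed, the closed channel lets all later callers through immediately, so the gate only costs a map lookup on the hot path.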
Chad Arimura
49d397293b global url replace 2017-05-29 17:10:47 -07:00
Travis Reeder
69f0201818 Some small cleanup to docs. 2017-05-26 18:54:26 +00:00
James
e4bb04887e Rewrite imports to use forks files on gitlab not use githubs. 2017-05-16 11:06:32 -07:00
Travis Reeder
4b9bba352d Rename location. 2017-05-15 11:00:15 -07:00
Travis Reeder
d0ca2f9228 Moved runner into this repo, update dep files and now builds. 2017-04-21 07:42:42 -07:00
Travis Reeder
615ae5c36f Mass s&r: iron-io -> kumokit 2017-04-19 09:49:12 -06:00
Travis Reeder
10f3178ae9 Switching to new dep tool (#616)
* making things work

* #506 - Add ability to login to a private docker registry

* Rolling back "make things work" to test them out more.

* Rolling back "make things work" to test them out more.

* credentials from docker/config.json if ENV is missing

* should get docker auth info just in the init

* update glide lock

* update glide

* Switched to new go dep tool, glide is too frikin annoying.

* Updated circle builds to use dep

* Added GOPATH/bin to path.

* Added GOPATH/bin to path.

* Using regular make test, instead of docker one (not sure why it was using the docker one?).
2017-04-07 11:22:08 -07:00
C Cirello
c48bd95fa6 server: stats endpoint (#468)
fixes #389
2017-01-03 21:39:29 +01:00
C Cirello
1dc3145045 functions: upgrade runner to latest (#434)
* functions: upgrade runner

* functions: update to latest runner

Supersedes and fixes #433
2016-12-14 00:10:24 +01:00
C Cirello
0cdd1db3e1 functions: fix goroutine leak in runner (#394)
* functions: fix goroutine leak in runner

* functions: ensure taskQueue is consumed after context cancellation
2016-12-06 16:11:06 +01:00
C Cirello
ac0044f7d9 functions: hot containers (#332)
* functions: modify datastore to accommodate hot containers support

* functions: protocol between functions and hot containers

* functions: add hot containers clockwork

* fn: add hot containers support
2016-11-28 15:45:35 -02:00
Pedro Nasser
867eb4b176 Changes on function/metric loggers (#343)
* initial fix logger

* fix DefaultFuncLogger

* fix runner and tests

* reverting: sending async task stdout to func logger
2016-11-27 16:36:40 -02:00
C Cirello
9d06b6e687 functions: common concurrency stream for sync and async (#314)
* functions: add bounded concurrency

* functions: plug runners to sync and async interfaces

* functions: update documentation about the new env var

* functions: fix test flakiness

* functions: the runner is self-regulated, no need to set a number of runners

* functions: push the execution to the background on incoming requests

* functions: ensure async tasks are always on

* functions: add prioritization to tasks consumption

Ensure that Sync tasks are consumed before Async tasks. Also, fixes
termination races problems for free.

* functions: remove stale comments

* functions: improve mem availability calculation

* functions: parallel run for async tasks

* functions: check for memory availability before pulling async task

* functions: comment about rnr.hasAvailableMemory and sync.Cond

* functions: implement memory check for async runners using Cond vars

* functions: code grooming

- remove unnecessary goroutines
- fix stale docs
- reorganize import group

* Revert "functions: implement memory check for async runners using Cond vars"

This reverts commit 922e64032201a177c03ce6a46240925e3d35430d.

* Revert "functions: comment about rnr.hasAvailableMemory and sync.Cond"

This reverts commit 49ad7d52d341f12da9603b1a1df9d145871f0e0a.

* functions: set a minimum memory availability for sync

* functions: simplify the implementation by removing the priority queue

* functions: code grooming

- code deduplication
- review waitgroups Waits
2016-11-18 18:23:26 +01:00
Carlos C
d5fb1afda7 Revert "Assert License (#224)"
This reverts commit a61c4dab78.
2016-11-06 09:25:12 -08:00
C Cirello
a61c4dab78 Assert License (#224)
* license: assert license for Go code
* license: add in shell scripts
* license: assert license for Ruby code
* license: assert license to individual cases
* license: assert license to Dockerfile
2016-11-05 23:33:07 +01:00
Nikhil Marathe
1397899358 Fix max memory on non-linux machines and memory decrement after failures.
* Always decrement memory even if task preparation or execution fails.

* Fall back to max 2GB memory on non-Linux. 300GB is ridiculous.

* Simplify loop
2016-10-31 17:33:04 -07:00
Travis Reeder
41c06644d9 Docs related to running in production. (#174)
* Fixed up api.md, removed Titan references.

* Adding more documentation on running in production.

* Update deps for ironmq.
2016-10-17 11:31:58 -07:00
Seif Lotfy سيف لطفي
064d597b60 Fix runner changes (#135)
* Upgrade iron-io/runner to 165c16a9

* fix support for Stdin to work
2016-10-07 21:17:40 +02:00
Pedro Nasser
52f78eb601 fix runner changes (#132)
Fix runner changes
2016-10-07 18:49:16 +02:00
Seif Lotfy سيف لطفي
52cab30056 Change PAYLOAD input to STDIN (#111)
* change to iron-io/runner dependency
* Fix runner dependency
* Change PAYLOAD input to STDIN, fixes #40
2016-10-06 18:44:58 -03:00
Seif Lotfy سيف لطفي
b7bf73f5d2 Makefile (#122)
* Update Readme and add Makefile
* Skip stale tests (in wait for stdin support)

* Revert "Skip stale tests (in wait for stdin support)"

This reverts commit 228da3776503f40ca53df70a79a9e4a9c73fd8b5.
2016-10-06 20:46:29 +02:00
C Cirello
3ca137a01c Upgrade to Go 1.7 (#128)
* Upgrade to stdlib context package
* Modernized syntax
2016-10-06 20:10:00 +02:00
Seif Lotfy سيف لطفي
fbcec6bf40 Depend on iron-io/runner instead of iron-io/worker (#124) 2016-10-05 20:42:12 +02:00
Henrique Chehad
ba3f0b360b removed "reserved_memory" metric 2016-09-21 22:35:07 -03:00
Henrique Chehad
6b910d0b75 added wait time total and reserved memory metrics 2016-09-21 20:25:37 -03:00
Henrique Chehad
06294b4b77 updated worker repository ref 2016-09-19 20:41:35 -03:00
Pedro Nasser
b867b20cfd fix sleep time 2016-09-17 12:11:04 -03:00
Pedro Nasser
853c8b4534 prevent zero memory requirement 2016-09-13 23:44:16 -03:00
Pedro Nasser
688a6a0718 invalid method 2016-09-13 23:40:30 -03:00
Pedro Nasser
89a4092dc1 merge with master 2016-09-13 23:25:06 -03:00
Pedro Nasser
da1746dc97 improvements 2016-09-13 23:22:00 -03:00
Pedro Nasser
e6d0079051 add IGNORE_MEMORY 2016-09-12 14:44:11 -03:00
Pedro Nasser
a98b7e25d0 metric logger 2016-09-12 11:46:21 -03:00
Pedro Nasser
81a3394317 add queue.full and timeout count 2016-09-09 01:03:27 -03:00
Pedro Nasser
5d50721db1 add initial queue to runner 2016-09-09 00:54:00 -03:00
Henrique Chehad
615b421dfa migrated EnsureUsableImage to EnsureImageExists 2016-08-30 11:08:54 -03:00
Pedro Nasser
6a2e9b29be update titan, other deps and minor changes 2016-08-24 16:11:21 -03:00
Henrique Chehad
148d52c890 updates after runner factored 2016-08-22 19:17:58 -03:00