Commit Graph

14 Commits

Author SHA1 Message Date
Tolga Ceylan
0105f8321e fn: stats view/distribution improvements (#1154)
* fn: stats view/distribution improvements

*) View latency distribution is now an argument
in view creation functions. This allows easier
override to set custom buckets. It is simplistic
and assumes all latency views would use the same
set, but in practice this is already the case.
*) Removed API view creation to main, this should not
be enabled for all node types. This is consistent with
the rest of the system.

* fn: Docker samples of cpu/mem/disk with specific buckets
2018-08-03 11:06:54 -07:00
Owen Cliffe
fff95e7992 Clean up/make consistent the APIs for registering core components, make Docker an optional component at compile time (#1111) 2018-07-07 10:37:19 +01:00
Peter Jausovec
bd5150f1ac Extract register view functionality (#1056)
* WIP

* Create separate Register*Views functions that are called from main.
2018-06-12 17:24:21 +01:00
Owen Cliffe
1ad27f4f0d Inverting deps on SQL, Log and MQ plugins to make them optional dependencies of extended servers, Removing some dead code that brought in unused dependencies Filtering out some non-linux transitive deps. (#1057)
* initial Db helper split - make SQL and datastore packages optional

* abstracting log store

* break out DB, MQ and log drivers as extensions

* cleanup

* fewer deps

* fixing docker test

* hmm dbness

* updating db startup

* Consolidate all your extensions into one convenient package

* cleanup

* clean up dep constraints
2018-06-11 18:23:28 +01:00
Gerardo Viedma
ea1f94253f Implement graceful shutdown of agent.DataAccess (#1008)
* Implements graceful shutdown of agent.DataAccess and underlying Datastore/Logstore/MessageQueue

* adds tests for closing agent.DataAccess and Datastore
2018-05-21 11:28:21 +01:00
jan grant
91e58afa55 The opencensus API changes between 0.6.0 and 0.9.0 (#980)
We get some useful features in later versions; update so as to not
pin downstream consumers (extensions) to an older version.
2018-05-09 14:55:00 +01:00
Reed Allman
56a2861748 move calls to logstore, implement s3 (#911)
* move calls to logstore, implement s3

closes #482

the basic motivation is that logs and calls will be stored with a very high
write rate, while apps and routes will be relatively infrequently updated; it
follows that we should likely split up their storage location, to back them
with appropriate storage facilities. s3 is a good candidate for ingesting
higher write rate data than a sql database, and will make it easier to manage
that data set. can read #482 for more detailed justification.

summary:

* calls api moved from datastore to logstore
* logstore used in front-end to serve calls endpoints
* agent now throws calls into logstore instead of datastore
* s3 implementation of calls api for logstore
* s3 logs key changed (nobody using / nbd?)
* removed UpdateCall api (not in use)
* moved call tests from datastore to logstore tests
* mock logstore now tested (prev. sqlite3 only)
* logstore tests run against every datastore (mysql, pg; prev. only sqlite3)
* simplify NewMock in tests

commentary:

brunt of the work is implementing the listing of calls in GetCalls for the s3
logstore implementation. the GetCalls API requires returning items in the
newest to oldest order, and the s3 api lists items in lexicographic order
based on created_at. An easy thing to do here seemed to be to reverse the
encoding of our id format to return a lexicographically descending order,
since ids are time based, reasonably encoded to be lexicographically
sortable, and de-duped (unlike created_at). This seems to work pretty well,
it's not perfect around the boundaries of to_time and from_time and a tiny
amount of results may be omitted, but to me this doesn't seem like a deal
breaker to get 6999 results instead of 7000 when trying to get calls between
3:00pm and 4:00pm Monday 3 weeks ago. Of course, without to_time and
from_time, there are no issues in listing results. We could use created at and
encode it, but it would be an additional marker for point lookup (GetCall)
since we would have to search for a created_at stamp, search for ids around
that until we find the matching one, just to do a point lookup. So, the
tradeoff here seems worth it. There is additional optimization around to_time
to seek over newer results (since we have descending order).

The other complication in GetCalls is returning a list of calls for a given
path. Since the keys to do point lookups are only app_id + call_id, and we
need listing across an app as well, this leads us to the 'marker' collection
which is sorted by app_id + path + call_id, to allow quick listing by path.
All in all, it should be pretty straightforward to follow the implementation
and I tried to be lavish with the comments, please let me know if anything
needs further clarification in the code.

The implementation itself has some glaring inefficiencies, but they're
relatively minute: json encoding is kinda lazy, but workable; s3 doesn't offer
batch retrieval, so we point look up each call one by one in get call; not
re-using buffers -- but the seeking around the keys should all be relatively
fast, not too worried about performance really and this isn't a hot path for
reads (need to make a cut point and turn this in!).

Interestingly, in testing, minio performs significantly worse than pg for
storing both logs and calls (or just logs, I tested that too). minio seems to
have really high cpu consumption, but in any event, we won't be using minio,
we'll be using a cloud object store that implements the s3 api. Anyway, mostly
a knock on using minio for high performance, not really anything to do with
this, just thought it was interesting.

I think it's safe to remove UpdateCall, admittedly this made implementing the
s3 api a lot easier. This operation may also be something we never need, it
was unused at present and was only in the cards for a previous hybrid
implementation, which we've now abandoned. If we need, we can always resurrect
from git.

Also not worried about changing the log key, we need to put a prefix on this
thing anyway, but I don't think anybody is using this anyway. in any event, it
simply means old logs won't show up through the API, but aside from nobody
using this yet, that doesn't seem a big deal breaker really -- new logs will
appear fine.

future:

TODO make logstore implementation optional for datastore, check in front-end
at runtime and offer a nil logstore that errors appropriately

TODO low hanging fruit optimizations of json encoding, re-using buffers for
download, get multiple calls at a time, id reverse encoding could be optimized
like normal encoding to not be n^2

TODO api for range removal of logs and calls

* address review comments

* push id to_time magic into id package
* add note about s3 key sizes
* fix validation check
2018-04-05 10:49:25 -07:00
Reed Allman
8af605cf3d update thrift, opencensus, others (#893)
* update thrift, opencensus, others

* stats: update to opencensus 0.6.0 view api
2018-03-26 15:43:49 -07:00
Denis Makogon
3c15ca6ea6 App ID (#641)
* App ID

* Clean-up

* Use ID or name to reference apps

* Can use app by name or ID

* Get rid of AppName for routes API and model

 routes API is completely backwards-compatible
 routes API accepts both app ID and name

* Get rid of AppName from calls API and model

* Fixing tests

* Get rid of AppName from logs API and model

* Restrict API to work with app names only

* Addressing review comments

* Fix for hybrid mode

* Fix rebase problems

* Addressing review comments

* Addressing review comments pt.2

* Fixing test issue

* Addressing review comments pt.3

* Updated docstring

* Adjust UpdateApp SQL implementation to work with app IDs instead of names

* Fixing tests

* fmt after rebase

* Make tests green again!

* Use GetAppByID wherever it is necessary

 - adding new v2 endpoints to keep hybrid api/runner mode working
 - extract CallBase from Call object to expose that to a user
   (it doesn't include any app reference, as we do for all other API objects)

* Get rid of GetAppByName

* Adjusting server router setup

* Make hybrid work again

* Fix datastore tests

* Fixing tests

* Do not ignore app_id

* Resolve issues after rebase

* Updating test to make it work as it was

* Tabula rasa for migrations

* Adding calls API test

 - we need to ensure we give "App not found" for the missing app and missing call in first place
 - making previous test work (request missing call for the existing app)

* Make datastore tests work fine with correctly applied migrations

* Make CallFunction middleware work again

 had to adjust its implementation to set app ID before proceeding

* The biggest rebase ever made

* Fix 8's migration

* Fix tests

* Fix hybrid client

* Fix tests problem

* Increment app ID migration version

* Fixing TestAppUpdate

* Fix rebase issues

* Addressing review comments

* Renew vendor

* Updated swagger doc per recommendations
2018-03-26 11:19:36 -07:00
Reed Allman
206aa3c203 opentracing -> opencensus (#802)
* update vendor directory, add go.opencensus.io

* update imports

* oops

* s/opentracing/opencensus/ & remove prometheus / zipkin stuff & remove old stats

* the dep train rides again

* fix gin build

* deps from last guy

* start in on the agent metrics

* she builds

* remove tags for now, cardinality error is fussing. subscribe instead of register

* update to patched version of opencensus to proceed for now TODO switch to a release

* meh

fix imports

* println debug the bad boys

* lace it with the tags

* update deps again

* fix all inconsistent cardinality errors

* add our own logger

* fix init

* fix oom measure

* remove bugged removal code

* fix s3 measures

* fix prom handler nil
2018-03-05 09:35:28 -08:00
Gerardo Viedma
966ce58525 Use new metrics API for s3 log metrics (#680)
* use new metrics API for histogram metrics

* Avoid creating an extra tracing span

* use new metrics api for histograms

* fix minor formatting issue
2018-01-15 10:09:03 +00:00
Gerardo Viedma
60d2e92c9a Replace minio-go with aws-sdk-go for s3-compatible log backend (#670)
* Logs should support specifying region when using S3-compatible object store

* Use aws-sdk-go client for s3 backed logstore

* fixes vendor with aws-sdk-go dependencies
2018-01-10 09:44:04 -08:00
Gerardo Viedma
6b91351ed8 Logs should support specifying region when using S3-compatible object store (#645)
Logs should support specifying region when using S3-compatible object store
2018-01-05 17:34:32 +00:00
Reed Allman
2d8c528b48 S3 loggyloo (#511)
* add minio-go dep, update deps

* add minio s3 client

minio has an s3 compatible api and is an open source project and, notably, is
not amazon, so it seems best to use their client (fwiw the aws-sdk-go is a
giant hair ball of things we don't need, too). it was pretty easy and seems
to work, so rolling with it. also, minio is a totally feasible option for fn
installs in prod / for demos / for local.

* adds 's3' package for s3 compatible log storage api, for use with storing
logs from calls and retrieving them.
* removes DELETE /v1/apps/:app/calls/:call/log endpoint
* removes internal log deletion api
* changes the GetLog API to use an io.Reader, which is a backwards step atm
due to the json api for logs, I have another branch lined up to make a plain
text log API and this will be much more efficient (also want to gzip)
* hooked up minio to the test suite and fixed up the test suite
* add how to run minio docs and point fn at it docs

some notes: notably we aren't cleaning up these logs. there is a ticket
already to make a Mr. Clean who wakes up periodically and nukes old stuff, so
am punting any api design around some kind of TTL deletion of logs. there are
a lot of options really for Mr. Clean, we can notably defer to him when apps
are deleted, too, so that app deletion is fast and then Mr. Clean will just
clean them up later (seems like a good option).

have not tested against BMC object store, which has an s3 compatible API. but
in theory it 'just works' (the reason for doing this). in any event, that's
part of the service land to figure out.

closes #481
closes #473

* add log not found error to minio land
2017-11-20 17:39:45 -08:00