Does not work for large clusters #22
Comments
Whoa, that's quite a lot of Regionservers. I am aware that Hannibal is currently not usable for such large installations. However, you came pretty far; I would have guessed Hannibal would crash a lot earlier ;-) A colleague and I just looked at the request and response and think it can be improved quite a lot:
However, I fear that this won't be enough. The UI also has to be adjusted to be usable with such huge amounts of data. Another bottleneck could be the communication between Hannibal and the Regionservers. We should also consider making the recording of metrics/compactions optional altogether, or configurable on a per-region basis. If someone wants to start working on some of those issues, let me know if I can assist you.
Sorry for the late response, I started working on it a bit. The main thing I noticed when looking at the server-side code was that for each graph we make calls out to every region server for data points. When I created a cache that uses a background thread to update at an interval of, for example, every 5 to 10 minutes, this helped a great deal with response times when loading the graphs. The view layer was still somewhat of an issue, but the region cache improved performance quite a bit. A crude patch would be located here: please forgive my Scala code, I know it's pretty horrible, just a POC. The view layer refactoring seems like a daunting task, as I would probably screw things up, but I would be up for helping create APIs for each graph so that the rendering layer refactoring could be done incrementally.
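To illustrate the caching idea described above, here is a minimal sketch in Python (not the Scala patch itself, which is not reproduced here). It assumes a hypothetical `fetch_all_regions` function standing in for the expensive round of calls to every region server; the class name `RegionCache` and the data shape are likewise made up for illustration.

```python
import threading
from typing import Callable, Dict, List

Snapshot = Dict[str, List[dict]]


class RegionCache:
    """Keeps the latest region snapshot and refreshes it on a background thread."""

    def __init__(self, fetch: Callable[[], Snapshot], interval_seconds: float):
        self._fetch = fetch
        self._interval = interval_seconds
        self._lock = threading.Lock()
        self._snapshot: Snapshot = {}
        self._stop = threading.Event()
        self.refresh()  # populate once so the first request never sees an empty cache
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def refresh(self) -> None:
        data = self._fetch()  # the slow part: one call per region server
        with self._lock:
            self._snapshot = data

    def _run(self) -> None:
        # Event.wait returns False on timeout and True once stop() has been called.
        while not self._stop.wait(self._interval):
            try:
                self.refresh()
            except Exception:
                pass  # keep serving the stale snapshot if a refresh fails

    def get(self) -> Snapshot:
        # Graph requests read the cached snapshot instead of calling out.
        with self._lock:
            return self._snapshot

    def stop(self) -> None:
        self._stop.set()
        self._thread.join()


calls = {"count": 0}


def fetch_all_regions() -> Snapshot:
    # Hypothetical stand-in for querying every region server.
    calls["count"] += 1
    return {"rs1": [{"region": "t1,,1", "storefile_size_mb": 120}]}


cache = RegionCache(fetch_all_regions, interval_seconds=3600)
for _ in range(100):
    cache.get()  # 100 graph requests, no additional region server calls
cache.stop()
```

The point of the design is that request latency no longer depends on the number of region servers; the cost is that graphs can lag behind the cluster by up to one refresh interval.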
Thanks very much for the commit, this looks like a great improvement. I also like that you cleaned up the model a bit :-) However, I think we'll have to reduce the default interval of 30 minutes and introduce a configuration value for it. I also have to think about how we'll sync this up with the regioninfo metrics, as it doesn't make any sense to record the same cached values over and over again. Maybe we should change the update of the regioninfo metrics so that they are recorded just after the cache gets updated. I think we should introduce the following configuration values (dunno whether the defaults are good values though):
I hope I can implement it soon.
I added most of your code and added the new configuration values (the names differ a bit from the previously proposed ones).
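One way to read the suggestion above (record the regioninfo metrics right after the cache is updated) is a refresh hook: the cache notifies listeners once per refresh, so the metric recorder never re-records an unchanged cached value. The sketch below is in Python and purely illustrative; the setting name `refresh_interval_seconds`, its default, and the classes are assumptions, not Hannibal's actual configuration or code.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class CacheSettings:
    # Hypothetical setting name and default, for illustration only.
    refresh_interval_seconds: int = 300

    def __post_init__(self) -> None:
        if self.refresh_interval_seconds <= 0:
            raise ValueError("refresh interval must be positive")


@dataclass
class RegionInfoCache:
    fetch: Callable[[], Dict[str, int]]
    listeners: List[Callable[[Dict[str, int]], None]] = field(default_factory=list)
    snapshot: Dict[str, int] = field(default_factory=dict)

    def refresh(self) -> None:
        # A scheduler would call this every settings.refresh_interval_seconds.
        self.snapshot = self.fetch()
        for notify in self.listeners:
            notify(self.snapshot)  # metrics are recorded once per refresh


recorded: List[Dict[str, int]] = []


def record_metrics(snapshot: Dict[str, int]) -> None:
    # Hypothetical recorder; a real one would write to the metrics store.
    recorded.append(dict(snapshot))


settings = CacheSettings()
sizes = iter([{"region-a": 100}, {"region-a": 140}])
cache = RegionInfoCache(fetch=lambda: next(sizes))
cache.listeners.append(record_metrics)

cache.refresh()
_ = cache.snapshot  # reading the cache any number of times records nothing
_ = cache.snapshot
cache.refresh()
```

With this wiring, the number of recorded data points is tied to the refresh interval rather than to how often the graphs are viewed.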
I built Hannibal and changed the storage layer from H2 to MySQL. We have around 1000 regionservers with around 300k regions in total. Because so much of the logic is done at the view layer, pages take minutes to load. The response size for some of the requests is > 200 MB, and too much of the sorting and combining is done at the view layer. This makes Hannibal, which is an awesome tool, unusable for us.
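As a rough illustration of moving the sorting and combining out of the view layer, the Python sketch below aggregates regions per server and keeps only the largest N regions before anything is sent to the browser. The field names (`server`, `name`, `size_mb`) and the `summarize` function are hypothetical and do not reflect Hannibal's actual API.

```python
import heapq
from collections import defaultdict
from typing import Dict, List


def summarize(regions: List[dict], top_n: int) -> Dict[str, object]:
    """Combine and sort on the server so the response stays small."""
    per_server = defaultdict(lambda: {"regions": 0, "size_mb": 0})
    for region in regions:
        totals = per_server[region["server"]]
        totals["regions"] += 1
        totals["size_mb"] += region["size_mb"]
    # nlargest avoids sorting the full region list when only a few are shown.
    largest = heapq.nlargest(top_n, regions, key=lambda r: r["size_mb"])
    return {"servers": dict(per_server), "largest_regions": largest}


sample = [
    {"server": "rs1", "name": "a", "size_mb": 100},
    {"server": "rs1", "name": "b", "size_mb": 200},
    {"server": "rs2", "name": "c", "size_mb": 50},
]
summary = summarize(sample, top_n=2)
```

The response then grows with the number of region servers plus `top_n`, not with the total number of regions.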