-
Notifications
You must be signed in to change notification settings - Fork 8
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
resource watcher, log watcher, libcloud, object_storage #31
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow, very very nice to see a not-gigantic, mostly understandable piece of code already! I'll need to read it a few times to fully grasp what is happening, and what we'd want for the project, but I've already added some comments that may help shape the direction.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've reviewed it a bit more. Still happy that what is actually happening, is not that complex to understand. That is a sign of clear coding, well done!
I think the next steps would be:
- Move more job logging related code to its own module (incl. objectstorage)
- Polish
api.py
related to this change - Consistently use
joblogs
terminology (or whatever we finally settle on) - Find a clean solution for storage configuration
- Error handling and/or logging
- The other smaller security- and performance-related things
- Make sure CI fully passes
Regarding integration tests, that is up to you where and if you'd like to include it in this PR.
And regarding naming: I would propose to call this feature I think that ideally we'd move all files related to this feature to a subdirectory |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the changes. I think it's getting into a good shape, well done! Some questions pending, but nothing that big, I think.
One general thought is that most of the code uses classes, with exception from the http handlers, and utility functions. This is a larger piece of functionality, that, from the outside, I would think would fit the use of a class well.
And thinking more generally about procedural vs. (semi) object-oriented, I think the latter has benefits in code re-use. For example, consider the possibility that scrapyd-k8s would somehow evolve to talk to different clusters at the same time. Then having these global variables with state, would prohibit the current code to be used. When using classes, this state (and setting up connections to k8s etc.) naturally happens within the class instance context, and it would be much easier to get this working. Now this is just a thought exercise, but would nudge me towards preferring the use of a class for job log handling. And I also realize, that when a Docker-implementation would appear, this would naturally be configured by setting a joblog implementation, which would be a class as well.
So if you manage, would you be able to use a class for the Kubernetes joblogs implementation?
Thank you for your feedback!
I do agree with your reasoning and I will work on that, also a great moment for me to learn a bit more about design patterns that may be beneficial for this case. |
I've addressed the comments and also changed the structure to class implementation. The feature contains many methods and lines of code, so I decided to add docstrings to give a short overview on what is going on there. I think it takes less energy to simply read a description and then decide if you want to enable the feature than to read through the entire code. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, changes look good!
Great you added docstrings, very helpful. I'm not fully up-to-speed with Python docstring styles, perhaps it's good to check this with others in the team?
…ting instance of the Config class
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Super! I think this PR is almost finished.
Ideally, this feature would also be tested on CI. But that is still quite some work, perhaps, requires starting e.g. a minio server, and such. But that would really be helpful. Maybe we can create a new issue for this?
scrapyd_k8s.sample-k8s.conf
Outdated
|
||
;[joblogs] | ||
;# Choose storage provider | ||
;storage_provider = s3 | ||
;container_name = <container_name> | ||
|
||
;# Choose number of unique logs, but at least 2 | ||
;num_lines_to_check = 2 | ||
|
||
;[joblogs.storage.s3] | ||
;key = ${S3_KEY} | ||
;secret = ${S3_SECRET} | ||
;region = ${S3_REGION} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can use we #
to comment this out instead? ;
looks like it's fine, but it's less common to use for commenting out.
Also, let's include a real-world (commented-out) example here, so use a container_name like scrapyd-k8s-example-bucket
, and explain in a comment with the storage, that one needs to set the S3_KEY
and S3_SECRET
environment variables. Please include region
, so it's clear to readers that one can mix and match between environment and string, also in the storage.s3 section
scrapyd_k8s/api.py
Outdated
@@ -148,5 +146,7 @@ def run(): | |||
if config_username is not None and config_password is not None: | |||
enable_authentication(app, config_username, config_password) | |||
|
|||
launcher.enable_joblogs(config) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about putting this in a conditional like if config.joblogs() is not None
, I think that makes it more readable, now I think each time I read it: hey, is joblogs always enabled or not?
This would also avoid a warning about joblogs not being enabled when not configured. I think of this feature as an add-on, not as something that everyone will use.
logger.warning(f"File not found: {file_path}") | ||
return [] | ||
|
||
def concatenate_and_delete_files(self, main_file_path, temp_file_path, block_size=6144): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same block size here, this really could use a class variable
scrapyd_k8s/launcher/docker.py
Outdated
@@ -66,6 +65,10 @@ def cancel(self, project_id, job_id, signal): | |||
c.kill(signal='SIG' + signal) | |||
return prevstate | |||
|
|||
def enable_joblogs(self, config): | |||
# Do nothing, since joblogs is not supported on Docker | |||
pass |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the very least we can do is log an error message here, right?
… not work for docker; refactored code to address minor changes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overal looks OK, interesting approach for handling the logging challenges!
There are quite a few functions that suffer from a being very lengthy. If you have the chance maybe see if you can refactor this a bit. One thing that I would definitely change is the use of the config class. Introduce some properties there that make it a lot easier to check for certain things in the config. This is something that happens in multiple places throughout the code, and would be nice to have just in 1 location.
It seems like there are also no QL check in this project. Even though that is the case, it would still be beneficial to use these tools during your own development to see how you can improve your changes to the codebase. This of course keeping in mind with other decisions that were made beforehand.
@@ -148,5 +148,11 @@ def run(): | |||
if config_username is not None and config_password is not None: | |||
enable_authentication(app, config_username, config_password) | |||
|
|||
if config.joblogs() is not None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lets introduce a property in the config class that returns a bool and use that to check here instead of calling a function that is supposed to return something that we check. I would suggest splitting it up like
@property
def joblogs_is_configured(self):
return self._config.has_section('joblogs')
def joblogs(self):
return self._config['joblogs'] if self.joblogs_is_configured else None
Then we can improve this line here by checking the property
if config.joblogs_is_configured:
...
Not sure if joblogs_is_configured
is the right property name, but just as a suggestion. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey Wesley, thanks for your review. I don't have anything against property or decorator pattern, but since we don't use it, it really belongs to another PR about refactoring of the whole project, not my logging feature.
None | ||
""" | ||
joblogs_config = config.joblogs() | ||
if joblogs_config and joblogs_config.get('storage_provider') is not None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd also move the logic for checking the storage provider to the config class so if we need it somewhere else we would only need to change the logic within the config class in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Logging feature is an additional "package" you can use, it's not default, so cluttering core code components with it not a good way, in my opinion.
block_size = self.DEFAULT_BLOCK_SIZE | ||
data = b'' | ||
remaining_lines = num_lines | ||
while remaining_lines > 0 and file_size > 0: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be good to have some description of the algorithm somewhere in the docstring or as inline comments. Perhaps we could also split this up a few different functions, perhaps entirely as a separate utility?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are several comments across the algorithm, otherwise a special description that explains how it works belongs to wiki or something like that. If you have specific lines where you expect a specific comment, let me know, because to me the algorithm is clear.
log_lines_counter = 0 | ||
|
||
if len(last_n_lines) > log_lines_counter: | ||
self.concatenate_and_delete_files(log_file_path, temp_file_path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What should happen if an exception is raised here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It will be caught because the whole thing is wrapped inside try/except. Is this the answer you were looking for?
------- | ||
None | ||
""" | ||
try: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wrapping this entire logic inside a try except and catching a bare exception is not very clean. Perhaps wrap the function where we call it and handle what needs to happen when an exception is raised. But first lets see if we can make the except more specific.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, I changed it so only specific exceptions are caught
if self.pod_tmp_mapping.get(job_name) is not None: | ||
return self.pod_tmp_mapping[job_name] | ||
else: | ||
temp_dir = tempfile.gettempdir() | ||
app_temp_dir = os.path.join(temp_dir, 'job_logs') | ||
os.makedirs(app_temp_dir, exist_ok=True) | ||
fd, path = tempfile.mkstemp(prefix=f"{job_name}_logs_", suffix=".txt", dir=app_temp_dir) | ||
os.close(fd) | ||
self.pod_tmp_mapping[job_name] = path | ||
return path |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if self.pod_tmp_mapping.get(job_name) is not None: | |
return self.pod_tmp_mapping[job_name] | |
else: | |
temp_dir = tempfile.gettempdir() | |
app_temp_dir = os.path.join(temp_dir, 'job_logs') | |
os.makedirs(app_temp_dir, exist_ok=True) | |
fd, path = tempfile.mkstemp(prefix=f"{job_name}_logs_", suffix=".txt", dir=app_temp_dir) | |
os.close(fd) | |
self.pod_tmp_mapping[job_name] = path | |
return path | |
if self.pod_tmp_mapping.get(job_name) is not None: | |
return self.pod_tmp_mapping[job_name] | |
temp_dir = tempfile.gettempdir() | |
app_temp_dir = os.path.join(temp_dir, 'job_logs') | |
os.makedirs(app_temp_dir, exist_ok=True) | |
fd, path = tempfile.mkstemp(prefix=f"{job_name}_logs_", suffix=".txt", dir=app_temp_dir) | |
os.close(fd) | |
self.pod_tmp_mapping[job_name] = path | |
return path |
return False | ||
except Exception as e: | ||
logger.exception(f"An unexpected error occurred while checking for object '{object_name}': {e}") | ||
return False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can remove 4 of returns by moving the return False to the end of the logic
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, thank you, I was adding exceptions sequentially while reading docs and experimenting with different scenarios, so I ended up in this ugly piece of code:D Fixed!
Thanks for your comments, @leewesleyv! I think moving config file handling to the config class would make sense (though I there are also arguments to keep joblogs-related code in the joblogs parts). Then if the joblogs-related code in the config class would grow too much (which I don't think will happen with this PR, but I might be wrong), then it could warrant its own joblog-config class. So I'd say: go ahead with @leewesleyv's suggestions, if you think it's a good idea. |
…otes in all logging statements; refactor to single False statement in object exists
I think all the comments were addressed, is there anything else I can do or we are ready to merge?:) |
I think it would be ready to merge - and it's time to try this in production 👍 |
As a separate thread, resource watcher was added. When a new pod with specific labels (job labels) are aded to the cluster and are in a running state, another watcher starts to follow it's logs and write it to a file stored on a managing pod with scrapyd. Then all files are uploaded to an object storage of a preferred provider, which is implemented via Apache Libcloud library. Persistent Volume and PVC are commented out in the Kubernetes yaml file and were not tested in the cloud, but it can be used to add them to the managing pod to store log files in a persistent volume.