Add CI Workflow Action #50

Open
wants to merge 24 commits into
base: master


@weklund weklund commented Feb 13, 2025

GitHub workflow that adds CI to ibind.

Requirements are based on #45

  • Add ci.yaml for workflow
  • Configure Ruff
  • Configure Bandit

I've added Ruff lint selectors that I believe cover all PEP 8 rules. Check out the linting errors from my runs: https://github.com/weklund/ibind/actions/runs/13299790209

I'd like feedback on what was found. I'm happy to resolve the findings or add those rules to the ignore list.

Owner

@Voyz Voyz left a comment

@weklund superb work altogether! I've fixed most of the issues on the master branch.

What do you think we should do with oauth1a.py? There are a bunch of errors flagged there, but this code is almost entirely copied over from IBKR.

  • On one hand, we should leave it as is, so that it's easier to just bring in whatever fixes and changes IBKR rolls out.
  • On the other hand, we should take responsibility for that code and fix any issues with it. This means that if IBKR rolls out updates, we'd need a bit more effort to bring them in.

Thoughts on this?

]

ignore = [
"E501", # Ignore line length errors
Owner

I want to bring up ignoring E712, since in several places we deliberately write if x == False rather than if not x, because x can be None. Any objections?
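To illustrate the distinction, here is a minimal sketch of why the two checks differ when a value may be None. The describe_flag helper is hypothetical and not taken from ibind:

```python
def describe_flag(flag):
    """Classify a tri-state flag that may be True, False, or None (unset)."""
    # `not flag` would also be True for None, so the comparison is explicit.
    if flag == False:  # noqa: E712 - intentionally does not match None
        return "disabled"
    if flag is None:
        return "unset"
    return "enabled"
```

With `if not flag`, the None case would be treated as "disabled" too. Writing `flag is False` is a stricter alternative that E712 accepts, but unlike `==` it would no longer match other values that compare equal to False, such as 0.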

Author

@weklund weklund Feb 13, 2025

Nope. I was actually going to consider adding 2-3 other rulesets that I think are helpful for code quality, but I know we'll be opinionated and want to ignore several of the rules. Do you mind if I add E712 to the ignore list, and add the additional rulesets so we can see what we want to ignore?

Owner

Great, what other rulesets are you thinking of adding?

Author

[
"PL", # pylint
"DOC", # Documentation
"D", # pydocstyle
]

The findings could end up being pretty pedantic. This is more out of curiosity, to see whether we get new findings that we actually see value in.

Owner

This is resolved now?

Author

@weklund weklund Mar 15, 2025

Not sure exactly what you're referring to as resolved. We have added several rulesets, along with ignores for rules that we either don't find valuable or are punting on for future refactors.

As of now, Ruff is passing with the current config.

@weklund
Author

weklund commented Feb 13, 2025

On the other hand, we should take responsibility for that code and fix any issues with it.

@Voyz I'm inclined toward this. Good ownership for our users, regardless of how many times IB makes changes (hopefully not many). I'm happy to take a first stab at it, and it would help if we could find a volunteer with OAuth onboarding to test the branch while we work on the PR.

@weklund
Author

weklund commented Feb 13, 2025

@Voyz Just for reference, once we get linting green, then the workflow will move on to the bandit scan, and we'll likely have a similar workflow here where we fix/ignore to get to green before the PR is ready.

@weklund
Author

weklund commented Feb 13, 2025

Also side note, do you have a preference on draft PRs vs regular? Some teams like to keep PRs in 'draft|wip' mode until the PR author considers them ready to be merged.

I have zero preference just wondering if you already had a convention here.

* master: (68 commits)
  updated README
  small fix to tests
  renamed QueueControllerClass and SubscriptionProcessorClass arguments in IbkrWsClient to lower camel case
  minor lint fixes
  turned oauth into a subpackage of ibind and renamed oauth.py to oauth1a.py
  renamed OAuth nomenclature to OAuth 1.0a (or OAuth1a) in anticipation of OAuth 2.0 support
  removed print(client.live_session_token)
  added ability to catch request errors in IbkrClient and added better handling of 'no bridge' error
  breaking: removed redundant 'publish' parameter from initialize_brokerage_session which needs to always be set to True
  added support for calling initialize_brokerage_session in init_oauth
  added IBIND_OAUTH_WS_URL and default to it if no `url` is passed to IbkrWsClient
  maintain gateway support
  fix websocket oauth
  added oauth for websocket support
  updated rest_client tests
  added OAuthConfig.copy
  simplified oauth optional dependencies
  added oauth_rest_url to OAuthConfig
  simplified requirements-oauth.txt
  updated inline docs for IbkrClient
  ...
@weklund
Author

weklund commented Feb 14, 2025

re: oauth1a, looking at this a bit closer, a lot of these initial linting errors are from using all caps where the variable is only local to the function. I've normally seen all caps used when defining a more global constant that a bunch of functions will use.

like:

  • INT_BASE
  • STRING_ENCODING

And in each function that redefines, it is the same value. So to resolve instances where there's multiple uses, I'm going to shift it up, and keep the all caps. Other instances where it's only being used once I'll make lower case.
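A minimal sketch of that refactor, assuming hypothetical values and helper names (the real constants and functions live in oauth1a.py and may differ):

```python
# Hypothetical values for illustration; the real ones are defined in oauth1a.py.
INT_BASE = 16
STRING_ENCODING = "utf-8"


def hex_to_int(hex_string: str) -> int:
    """Parse a hex string using the shared module-level base."""
    return int(hex_string, INT_BASE)


def to_bytes(text: str) -> bytes:
    """Encode text using the shared module-level encoding."""
    return text.encode(STRING_ENCODING)


def byte_length(text: str) -> int:
    """A value used only once stays local and lower case."""
    encoded = to_bytes(text)
    return len(encoded)
```

Values shared by several functions are defined once at module level in all caps, while single-use locals follow the usual lower-case naming.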

@Voyz
Owner

Voyz commented Feb 14, 2025

Good ownership for our users, regardless of how many times IB makes changes (hopefully not many)

Very good point. And yes, historically IBKR hardly ever updates these things. Thanks for sharing your thoughts, I agree this is the best way to go ahead.

Also side note, do you have a preference on draft PRs vs regular?

Zero preference from my side as well. Sticking to regular PRs sounds good 👍

I'm going to shift it up, and keep the all caps. Other instances where it's only being used once I'll make lower case.

Superb! I appreciate you handling this, ping me when I can give it another review.

@weklund
Author

weklund commented Feb 14, 2025

https://github.com/weklund/ibind/actions/runs/13320132207/job/37203063677

Over 4000 lines of linting to scroll through ha. Let's see what AI can summarize. Any of these look valuable? @Voyz

The linting output you've provided highlights several types of issues in your codebase. Here's a summary of the main issues:

  1. Docstring Formatting (D400, D415, D205, D401, D200):
  • D400: The first line of a docstring should end with a period.
  • D415: The first line of a docstring should end with a period, question mark, or exclamation point.
  • D205: There should be one blank line between the summary line and the description in a docstring.
  • D401: The first line of a docstring should be in the imperative mood (e.g., "Return" instead of "Returns").
  • D200: One-line docstrings should fit on one line.
  2. Missing Docstrings (D100, D101, D102, D103, D104, D105, D107):
  • D100: Missing docstring in public module.
  • D101: Missing docstring in public class.
  • D102: Missing docstring in public method.
  • D103: Missing docstring in public function.
  • D104: Missing docstring in public package.
  • D105: Missing docstring in magic method.
  • D107: Missing docstring in __init__.
  3. Environment Variable Default Type (PLW1508):
  • The default value for an environment variable should be a string or None, not a boolean. For example, os.getenv('IBIND_CACERT', False) should be os.getenv('IBIND_CACERT', 'default_value') or os.getenv('IBIND_CACERT', None).
  4. Magic Values (PLR2004):
  • Avoid using magic values directly in the code. Consider replacing them with a constant variable for better readability and maintainability.
  5. Too Many Arguments (PLR0913):
  • Functions or methods have too many arguments, which can make them difficult to use and understand. Consider refactoring to reduce the number of arguments.
  6. Redefining Arguments (PLR1704):
  • Avoid redefining arguments within the function body, as it can lead to confusion and errors.

To address these issues, you should

  1. Ensure all public modules, classes, methods, and functions have appropriate docstrings.
  2. Follow the PEP 257 conventions for docstring formatting.
  3. Use appropriate default values for environment variables.
  4. Replace magic values with named constants.
  5. Refactor functions with too many arguments to improve readability.
  6. Avoid redefining arguments within functions.

@Voyz
Owner

Voyz commented Feb 14, 2025

Well, frankly some of these are great! Some may need some leeway though.

Docstring Formatting (D400, D415, D205, D401, D200):

All these sound silly and redundant to me. I can't see how they would improve the code base. Additionally, the mixin comments are all 1-1 from the IBKR docs, so I feel it would be an unnecessary chore to try to come up with our own documentation only to meet these rules. What do you think?

Missing Docstrings (D100, D101, D102, D103, D104, D105, D107):

This one is in a similar category, but I'm paying a bit more attention to it.

  • D100 is interesting, I wasn't aware it is considered a good standard to have a docstring on every module. Should we?

For the rest I'm on the fence. There are a ton of small methods or classes that are public, but I'm not sure whether bloating them with docstrings would be a good or a bad move. Some come from examples or tests, where I'm not sure we should be writing docstrings or caring about their formatting. I think some kind of configuration to point this only at certain folders could help figure out whether these are meaningful - can that be done?

Environment Variable Default Type (PLW1508):

Half of these come from examples, which I give a pass. The other half come from valid areas where ints are parsed, eg:

IBIND_WS_TIMEOUT = int(os.getenv('IBIND_WS_TIMEOUT', 5))

I don't know if changing that 5 to a "5" would make things less confusing to anyone?

The only area where this actually properly highlights an issue is here, in var.py:

IBIND_CACERT = os.getenv('IBIND_CACERT', False)

But the expected interaction here is that user always sets this value to a path string. If it is absent, we do want it to be False, because this is how requests' verify parameter works - it accepts either. We never expect user to set this value to "False" in their env var.
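As a rough sketch of the two patterns described above, the following wraps them in hypothetical helper functions (ws_timeout and cacert are illustrative names; in ibind these are plain module-level assignments in var.py):

```python
import os


def ws_timeout() -> int:
    # int() accepts both the int default (5) and the str read from the environment.
    return int(os.getenv('IBIND_WS_TIMEOUT', 5))  # noqa: PLW1508


def cacert():
    # Returns False when the variable is unset, otherwise the user's string.
    return os.getenv('IBIND_CACERT', False)  # noqa: PLW1508
```

One caveat worth keeping in mind: if a user did set IBIND_CACERT to the text "False", the result would be the non-empty string 'False', which is truthy and would be treated as a path rather than as the boolean.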

Magic Values (PLR2004):

These are exclusively HTTP error codes, or are found in tests. I think I'd give this a pass.

Too Many Arguments (PLR0913):

That I'd love to fix, just no idea how. Any suggestions?

Redefining Arguments (PLR1704):

That's actually quite useful, though only appears in one place, I'll try fixing it. Should I do it on the master branch and you can merge it in?

@weklund
Author

weklund commented Feb 16, 2025

I'd feel it would be an unnecessary chore to try to come up with our own documentation only to meet these rules. What do you think?

+1, let's add these to ignore.

docstring on every module. Should we?

I can go either way here. I'm less inclined since ibind is currently a fairly small code base. However, I'm slowly subscribing to the idea that verbose doc blocks (while potentially ugly to look at) help future development and understanding, particularly with AI tooling. Giving the tools as much context as possible should help when we have various prompts for analysis or development, with fewer issues.

Pedantic and verbose comment documentation kind of goes against the vision of easy readability in Python syntax, so I'm on the fence.

I don't know if changing that 5 to a "5" would make things less confusing to anyone?

Hmm to make sure I understand, this finding is calling out that getenv is expecting a string and the interpreter is forgiving here to take an int? Is that not an overloaded parameter?

Generally I think this is a good rule to keep, to make sure we don't create new functions with overloaded parameters, but a/ that is unlikely to happen and b/ a fix for it might be more confusing than just os.getenv - so ultimately I think we should ignore this rule.

I think I'd give this a pass

To ignore right? I think so. I didn't see a lot of other http codes mentioned. I only saw 400, 404, 401, and some values for test assertions. If we had various 2xx, 3xx, 4xx, 5xx http codes throughout the package then I think it would make sense to use enums from https://docs.python.org/3/library/http.html#http.HTTPStatus

Safer and less likely to introduce errors.
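For reference, a small sketch of what using the standard library's HTTPStatus enum could look like for the codes mentioned above. The describe_error function is hypothetical and only serves to show the comparison:

```python
from http import HTTPStatus


def describe_error(status_code: int) -> str:
    """Map the status codes mentioned above to a short description."""
    if status_code == HTTPStatus.BAD_REQUEST:  # 400
        return "bad request"
    if status_code == HTTPStatus.UNAUTHORIZED:  # 401
        return "unauthorized"
    if status_code == HTTPStatus.NOT_FOUND:  # 404
        return "not found"
    return "other"
```

HTTPStatus members are integers, so they compare equal to the plain int status codes that an HTTP response carries.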

That I'd love to fix, just no idea how. Any suggestions?

Maybe? I'd have to think on it, but I imagine it would require a decent amount of rewriting of the rest client. If we want to punt on it, we could add a single ignore for it at the finding, so we don't ignore the whole ruleset.
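A sketch of what such a single-instance ignore might look like. The send_request function is a hypothetical stand-in rather than the actual rest client method, and the placement assumes Ruff reports PLR0913 on the def line, where a noqa comment then applies:

```python
from typing import Optional


# TODO: reduce the number of arguments in a future refactor.
def send_request(  # noqa: PLR0913
    method: str,
    endpoint: str,
    base_url: Optional[str] = None,
    extra_headers: Optional[dict] = None,
    attempt: int = 0,
    log: bool = True,
) -> dict:
    """Hypothetical stand-in that only describes the request it would make."""
    return {
        "method": method,
        "url": (base_url or "") + endpoint,
        "headers": dict(extra_headers or {}),
        "attempt": attempt,
        "log": log,
    }
```

This keeps PLR0913 active for the rest of the code base while documenting the exception next to the function it applies to.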

Redefining Arguments (PLR1704):

Sure thing.

@salsasepp

Love your work on linting, I'm looking at ruff output for the first time. Learning a lot, thanks!

Too Many Arguments (PLR0913):

That I'd love to fix, just no idea how. Any suggestions?

Maybe? I'd have to think on it, but I imagine it would require a decent amount of rewriting of the rest client. If we want to punt on it, we could add a single ignore for it at the finding, so we don't ignore the whole ruleset.

  • I believe there's no PEP specifying what "Too Many" should mean. Setting the configurable ruff option to, say, 7, will remove many occurrences. Is 7 "too many"? Would that be considered cheating?
  • The biggest offender is the make_order_request() function, with 26 arguments. I'm not at all sure it's appropriate to suggest code changes just because a linter doesn't like something, but this function does nothing but convert a lengthy argument list to a dict, minus the None values. Could the same be accomplished with less code that is more maintainable, maybe like so?
from dataclasses import dataclass, asdict

@dataclass
class OrderRequest:
    conid: int | str
    side: str
    # more mandatory fields
    price: float | None = None
    # more optional fields

    def asdict(self) -> dict:
        return asdict(self, dict_factory=lambda x: {k: v for (k, v) in x if v is not None})

Please let me know if this is off topic for an initial CI workflow. I lack experience in how far one should go in making a linter happy.

@Voyz
Owner

Voyz commented Feb 17, 2025

@weklund

I'd say let's pass on module documentation for now then. If we encounter a good reason that would tip the scales let's revisit it.

Hmm to make sure I understand, this finding is calling out that getenv is expecting a string and the interpreter is forgiving here to take an int?

Not so much the interpreter, it's just some kind of consistency rule along the lines of "a method should return only one type". Since getenv will always return a string when it finds an env var, it is expected that its default parameter would also be a string, since this would give consistent behaviour - always returning a string.

But in cases where the env var is expected to be an int or a bool, it is always directly fed into something like int() or to_bool which will turn them into primitives. Either "5" or 5 when fed into int() will result in a 5 int. Hence this warning flag seems unnecessary.

To ignore right? I think so.

Yeah, let's ignore it.

Maybe? I'd have to think on it, but I imagine it would require a decent amount of rewriting of the rest client.

Yeah, that plus several other methods or constructors. IbkrWsClient has a pretty broad constructor too. That we could maybe break down a bit, but don't think we'd reach below 5.

RestClient._request method would be more trouble though:

    def _request(
            self,
            method: str,
            endpoint: str,
            base_url: str = None,
            extra_headers: dict = None,
            attempt: int = 0,
            log: bool = True,
            **kwargs
    ) -> Result:

Except for base_url, I wouldn't know how to reduce this any further.

Redefining Arguments (PLR1704):

Fixed 👍

The biggest offender is the make_order_request() function, with 26 arguments. I'm not at all sure it's appropriate to suggest code changes just because a linter doesn't like something, but this function does nothing but convert a lengthy argument list to a dict, minus the None values. Could the same be accomplished with less code that is more maintainable, maybe like so?

@salsasepp great suggestion, appreciate you chipping in 👍

The make_order_request has a function beyond just storing parameters and setting defaults: the variable names that the IBKR API expects are in lower camel case, while Python uses lowercase with underscores. This function performs this translation from Python to IBKR.

In all fairness, moving it to a dataclass like you suggest wouldn't be a bad idea, but a dictionary has the advantage that it can be easily extended should a new parameter appear on the API. With a dataclass, users won't be able to attach that new parameter easily without monkey patching or overriding that dataclass; with a dict, adding a new parameter is legal and easy.

We could work around that too, but this is just one of the many places where the linter observed too many parameters, so let's see if we can work out a solution that would fix things across the whole library first, and return to this idea if we fail to come up with anything else.
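To make the translation step concrete, here is a minimal dict-based sketch. FIELD_MAPPING and to_ibkr_payload are hypothetical names, and the mapping is an illustrative subset rather than the library's actual table:

```python
# Hypothetical subset of the Python-to-IBKR name mapping, for illustration only.
FIELD_MAPPING = {
    "conid": "conid",
    "side": "side",
    "order_type": "orderType",
    "acct_id": "acctId",
}


def to_ibkr_payload(mapping: dict, **params) -> dict:
    """Rename snake_case parameters to the API's names and drop None values."""
    return {
        mapping.get(name, name): value
        for name, value in params.items()
        if value is not None
    }
```

Because unknown names fall through unchanged, a caller can pass a parameter that the mapping does not know about yet, which is the extensibility advantage of the dict approach described above.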

@salsasepp

The make_order_request has a function beyond just storing parameters and setting defaults: [...] This function performs this translation from Python to IBKR.

Thank you! Totally overlooked that, my apologies. Makes my proposal obsolete.

I can maybe see how dataclasses could reduce the argument burden with some of our __init__() functions, reducing boilerplate and removing a few PLR0913. But not worth exploring in earnest I think.

@weklund
Author

weklund commented Feb 17, 2025

Looks like AI missed a few. I made single instance ignores for rest client with todos attached. Here are other code quality based findings:

ibind/client/ibkr_utils.py:103:13: PLW2901 `for` loop variable `instrument` overwritten by assignment target
    |
102 |             # if all conditions are  met, accept the instrument and its contracts
103 |             instrument = {**instrument, 'contracts': filtered_contracts}
    |             ^^^^^^^^^^ PLW2901
104 |
105 |         filtered_instruments.append(instrument)
    |


ibind/client/ibkr_utils.py:267:5: PLR0912 Too many branches (26 > 12)
    |
267 | def make_order_request(
    |     ^^^^^^^^^^^^^^^^^^ PLR0912
268 |         conid: Union[int, str],
269 |         side: str,
    |

ibind/client/ibkr_utils.py:267:5: PLR0915 Too many statements (54 > 50)
    |
267 | def make_order_request(
    |     ^^^^^^^^^^^^^^^^^^ PLR0915
268 |         conid: Union[int, str],
269 |         side: str,
    |

ibind/support/logs.py:54:12: PLW0603 Using the global statement to update `_initialized` is discouraged
   |
53 |     """
54 |     global _initialized
   |            ^^^^^^^^^^^^ PLW0603
55 |     if _initialized:
56 |         return
   |

ibind/support/py_utils.py:313:17: PLW2901 `for` loop variable `value` overwritten by assignment target
    |
311 |         if value is not None and value != [None]:
312 |             if preprocessors is not None and key in preprocessors:
313 |                 value = preprocessors[key](value)
    |                 ^^^^^ PLW2901
314 |             d[key] = value
    |

There were other instances of too many arguments but I omitted them, we get the picture.

Just a heads up, I wanted to bring awareness to these other findings, but I think it's ok if we punt them, as long as we document in some way like TODOs to complete as a backlog item.

@Voyz
Owner

Voyz commented Feb 18, 2025

Totally overlooked that, my apologies. Makes my proposal obsolete.

@salsasepp no, I think your idea was actually the correct one, well done 👍 I read up a bit more about this and it seems that going with config dataclasses is the best way to do this. We can work around the translation part when using dataclasses. I'll try implementing this.

PLW2901

Fixed it in one place, but in ibind/support/py_utils.py:313:17 it kind of makes sense to me.

    for key, value in optional.items():
        if value is not None and value != [None]:
            if preprocessors is not None and key in preprocessors:
                value = preprocessors[key](value)
            d[key] = value

Thoughts on ignoring this?

PLW0603 Using the global statement to update _initialized is discouraged

Right, I know globals are bad, I dislike using them myself. Thing is, the logging module and its loggers are global, and if we allow the ibind_logs_initialize function to run multiple times, we'd either have to bring in some complexity to manage the loggers' handlers better, or we'd end up with too many handlers attached to loggers. Doing this global _initialized flag allows us to ensure that this only runs once and that's it. Thoughts on this?

Just a heads up, I wanted to bring awareness to these other findings, but I think it's ok if we punt them, as long as we document in some way like TODOs to complete as a backlog item.

Sure! Though I enjoy being challenged by it and asking myself whether things are correct.

@weklund
Author

weklund commented Feb 26, 2025

Doing this global _initialized flag allows us to ensure that this only runs once and that's it. Thoughts on this?

Hmm, what do you think about using a singleton here? Essentially having a LoggingManager that's instantiated once and that can hold the state of both _initialized and _log_to_file - it should also be cleaner to test a singleton entity in the future, where we don't have to set up a global var and then delete it before the next test.

import logging

from ibind import var  # assumed import path for the settings module (var.py)


class LoggingManager:
    _instance = None

    def __new__(cls):
        # Create the single shared instance on first use, then keep returning it
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._instance._initialized = False
            cls._instance.log_to_file = False
        return cls._instance

    def initialize(
            self,
            log_to_console: bool = var.LOG_TO_CONSOLE,
            log_to_file: bool = var.LOG_TO_FILE,
            log_level: str = var.LOG_LEVEL,
            log_format: str = var.LOG_FORMAT,
    ):
        # Only the first call configures logging; later calls are no-ops
        if self._initialized:
            return

        self._initialized = True
        self.log_to_file = log_to_file

        logger = logging.getLogger('ibind')
        if not self.log_to_file:
            logging.getLogger('ibind_fh').addFilter(lambda record: False)

# Create singleton instance
log_manager = LoggingManager()

# Wrapper functions that use the singleton
def ibind_logs_initialize(**kwargs):
    """
    Wrapper for backward compatibility
    """
    return log_manager.initialize(**kwargs)

PLW2901 for loop variable value overwritten by assignment target

So I believe the logic can stay the same; the finding is about the loop variable's name being reassigned inside the loop. This might just go away by doing:

for key, value in optional.items():
    if value is not None and value != [None]:
        if preprocessors is not None and key in preprocessors:
            processed_value = preprocessors[key](value)
        else:
            processed_value = value
        d[key] = processed_value

Regarding the other findings with too many arguments or too many branches: We can either bump the argument value to get to passing, or add single line ignores on each instance. I'm not too picky here.
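For completeness, a self-contained sketch of the PLW2901 rewrite above wrapped in a function so it can be run on its own. The name filter_optional is hypothetical, and the surrounding function in py_utils.py may look different:

```python
from typing import Callable, Dict, Optional


def filter_optional(
    optional: dict,
    preprocessors: Optional[Dict[str, Callable]] = None,
) -> dict:
    """Copy non-empty optional values, preprocessing them where configured."""
    d = {}
    for key, value in optional.items():
        if value is None or value == [None]:
            continue
        # Use a separate name instead of reassigning the loop variable.
        processed_value = value
        if preprocessors is not None and key in preprocessors:
            processed_value = preprocessors[key](value)
        d[key] = processed_value
    return d
```

The behaviour matches the original loop: None and [None] entries are skipped, and a preprocessor is applied only when one is registered for the key.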

@Voyz
Owner

Voyz commented Feb 26, 2025

@weklund

Hmm, what do you think about using a singleton here? Essentially having a LoggingManager that's instantiated once and that can hold the state of both _initialized and _log_to_file - it should also be cleaner to test a singleton entity in the future, where we don't have to set up a global var and then delete it before the next test.

Thanks for prepping all that example code, it's super helpful to see it right away 👍 I'm - possibly incorrectly - allergic to singletons ever since my uni professor used them everywhere 🤣

Jokes aside though, can you expand on how this would help, other than removing that lint warning? Python's logging module already is a singleton, and by using the _initialized global variable we continue that pattern with our ibind logger. Our logs module-singleton can be reloaded with importlib.reload() for the sake of tests if needed. How would a singleton class help here? I know that the lint warning would disappear, I'm just trying to gauge whether there are other benefits - in particular the one you mentioned with tests.

In fact, just deleting the LoggingManager singleton instance wouldn't suffice as a reset, since logging's logger would persist and its handlers and filters would be doubled if we tried recreating the singleton class. If we want that we'd need to add code - either a function currently, or a method in your singleton class - that would strip the existing loggers/filters, and only then allow reinitialisation.
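The persistence issue can be seen in a few lines. In this sketch the logger name and the attach_handler helper are hypothetical:

```python
import logging


def attach_handler(logger_name: str) -> int:
    """Attach a handler and report how many the logger now holds."""
    # logging.getLogger returns the same logger object for the same name.
    logger = logging.getLogger(logger_name)
    logger.addHandler(logging.NullHandler())
    return len(logger.handlers)
```

Running the same setup twice against the same logger name leaves two handlers attached, so any reset has to strip the existing handlers and filters before reinitialising.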

So I believe the logic can stay the same, the finding is around the seeing a mutation of the same variable name as the loop variable.

Nice solution, thanks 👍

Regarding the other findings with too many arguments or too many branches: We can either bump the argument value to get to passing, or add single line ignores on each instance. I'm not too picky here.

I think we may just ignore some areas, like the IbkrWsClient constructor.

@Voyz
Owner

Voyz commented Mar 9, 2025

Hey @weklund just wanted to check in and see if we possibly could start closing in on introducing this? What are the outstanding questions here?

@weklund
Author

weklund commented Mar 10, 2025

Hey! Yep let's close this sooner rather than later :)

On the logging topic, I do still think we would get value from giving it its own singleton class. That value becomes really clear once we have tests and once we extend the logging functionality, so there's a case to be made for waiting until those things happen.

Value proposition:

  1. Encapsulation of state (already discussed)

Rather than having disconnected global variables like _initialized and _log_to_file, a singleton class keeps related state together in a single logical unit.

  2. Cleaner testing (already discussed)

You're absolutely right that simply removing a singleton instance wouldn't reset the underlying logging system. However, with a class we could add an explicit reset method:

def reset(self):
    """Reset logging state"""
    self._initialized = False
    self.log_to_file = False

    # Clean up handlers on existing loggers
    logger = logging.getLogger('ibind')
    for handler in logger.handlers[:]:  # Copy the list to avoid modification during iteration
        logger.removeHandler(handler)

    fh_logger = logging.getLogger('ibind_fh')
    for log_filter in fh_logger.filters[:]:  # Renamed to avoid shadowing the built-in filter
        fh_logger.removeFilter(log_filter)

  3. Explicit API

A class provides a clearer interface for managing logging behavior, making it easier for new developers to understand how logging is configured. Maybe it's just my opinion, but a module just doesn't seem like an explicit entity that we use logically. Open to being wrong.

  4. Future extensibility

If logging features grow, a class-based approach scales better than adding more globals like we are now.
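Putting the pieces together, a self-contained sketch of the singleton with an explicit reset might look like the following. It uses a hypothetical logger name and a reduced initialize signature so that it runs without ibind installed:

```python
import logging


class LoggingManager:
    """Sketch of a singleton that owns the logging state."""

    _instance = None

    def __new__(cls):
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._instance._initialized = False
            cls._instance.log_to_file = False
        return cls._instance

    def initialize(self, log_to_file: bool = False) -> None:
        """Configure logging on the first call; later calls do nothing."""
        if self._initialized:
            return
        self._initialized = True
        self.log_to_file = log_to_file
        logging.getLogger('ibind_sketch').addHandler(logging.NullHandler())

    def reset(self) -> None:
        """Clear state and strip handlers so initialize() can run again."""
        self._initialized = False
        self.log_to_file = False
        logger = logging.getLogger('ibind_sketch')
        for handler in logger.handlers[:]:  # iterate over a copy
            logger.removeHandler(handler)
```

In a test, calling reset() between cases would return the logger to a clean state without reloading the module.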

weklund added 2 commits March 9, 2025 20:11
* master:
  added fixes for automatically registering signal handler on non-main thread
  v0.1.12
  added load_dotenv monkey patch to display a warning if it is called after ibind has been imported
  updated .gitignore
  removed usage of _testcapi in test_utils.py and updated RaiseLogsContext with inline docs and docstrings
  Revert .gitignore
  Revert pyproject.toml
  Update Tickler to use threading events for stopping. No longer needs to wait for
  fix unit test
  added global default_filtering flag
  added OAuthConfig.verify_config
  changed 'localhost' default to '127.0.0.1'
  added extra notes to live_orders
  removed debug statements
  added use_session parameter and reworked the implementation of sessions to make it optional. Shutdown handling has been moved to RestClient, though `OAuth1aConfig.shutdown_oauth` parameter still defines whether OAuth gets closed when shutting down
  added way to specify custom parse_order_request mapping
  added is_close to OrderRequest parameters
  changed requests.requests for requests.Session
@weklund
Author

weklund commented Mar 10, 2025

Pending alignment on the global vars topic, that finishes the ruff config.

The last piece for this PR is the bandit security scanning. It looks like the findings are about the pyCrypto library. I assume we don't want to rehash all the hard work you did for OAuth 😁, but leveraging a deprecated library for security logic isn't ideal. What was the issue again that kept us from using the cryptography library?

bandit -r . -ll -x site-packages
[main]	INFO	profile include tests: None
[main]	INFO	profile exclude tests: None
[main]	INFO	cli include tests: None
[main]	INFO	cli exclude tests: None
[main]	INFO	running on Python 3.11.11
Working... ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00
Run started:2025-03-10 00:39:41.060095

Test results:
>> Issue: [B413:blacklist] The pyCrypto library and its module PKCS1_v1_5 are no longer actively maintained and have been deprecated. Consider using pyca/cryptography library.
   Severity: High   Confidence: High
   CWE: CWE-327 (https://cwe.mitre.org/data/definitions/327.html)
   More Info: https://bandit.readthedocs.io/en/1.8.3/blacklists/blacklist_imports.html#b413-import-pycrypto
   Location: ./ibind/oauth/oauth1a.py:10:0
9
10	from Crypto.Cipher import PKCS1_v1_5 as PKCS1_v1_5_Cipher
11	from Crypto.Hash import SHA256, HMAC, SHA1

--------------------------------------------------
>> Issue: [B413:blacklist] The pyCrypto library and its module SHA256 are no longer actively maintained and have been deprecated. Consider using pyca/cryptography library.
   Severity: High   Confidence: High
   CWE: CWE-327 (https://cwe.mitre.org/data/definitions/327.html)
   More Info: https://bandit.readthedocs.io/en/1.8.3/blacklists/blacklist_imports.html#b413-import-pycrypto
   Location: ./ibind/oauth/oauth1a.py:11:0
10	from Crypto.Cipher import PKCS1_v1_5 as PKCS1_v1_5_Cipher
11	from Crypto.Hash import SHA256, HMAC, SHA1
12	from Crypto.PublicKey import RSA

--------------------------------------------------
>> Issue: [B413:blacklist] The pyCrypto library and its module RSA are no longer actively maintained and have been deprecated. Consider using pyca/cryptography library.
   Severity: High   Confidence: High
   CWE: CWE-327 (https://cwe.mitre.org/data/definitions/327.html)
   More Info: https://bandit.readthedocs.io/en/1.8.3/blacklists/blacklist_imports.html#b413-import-pycrypto
   Location: ./ibind/oauth/oauth1a.py:12:0
11	from Crypto.Hash import SHA256, HMAC, SHA1
12	from Crypto.PublicKey import RSA
13	from Crypto.Signature import PKCS1_v1_5 as PKCS1_v1_5_Signature

--------------------------------------------------
>> Issue: [B413:blacklist] The pyCrypto library and its module PKCS1_v1_5 are no longer actively maintained and have been deprecated. Consider using pyca/cryptography library.
   Severity: High   Confidence: High
   CWE: CWE-327 (https://cwe.mitre.org/data/definitions/327.html)
   More Info: https://bandit.readthedocs.io/en/1.8.3/blacklists/blacklist_imports.html#b413-import-pycrypto
   Location: ./ibind/oauth/oauth1a.py:13:0
12	from Crypto.PublicKey import RSA
13	from Crypto.Signature import PKCS1_v1_5 as PKCS1_v1_5_Signature
14

@Voyz
Owner

Voyz commented Mar 10, 2025

hey @weklund thanks for expanding on these.

As for the logging singleton - I think I'm with you on the point that a module just doesn't seem like an explicit entity that we use logically. Python treats them as such, and I'm accustomed to thinking this way already, but I agree that not only is it strange, it is also not very intuitive to embrace at first. The other arguments, I think, don't speak for either solution - it's all achievable in either system - but I'm inclined to give your suggestion a go. Would you like to give it a shot implementing the logging singleton?


As for pyCrypto - interesting! It is the library used by IBKR. Most of the OAuth code is copied 1-1 from the code they distribute when signing up for OAuth. If there's a better way to do it without pyCrypto, it would require us to rewrite the OAuth logic. I don't feel anywhere near competent enough in internet security to do this reliably.

If these were the two options:

  • A) Releasing IBind with possibly-risky and slightly deprecated code using pyCrypto that has been distributed by the authors of the API we work with
  • B) Releasing it with self-authored rewrite of that OAuth module, without an in-house understanding of their implementation.

Me not being knowledgeable in the topic and having to rely purely on trust of someone else doing a good job rewriting and later supporting that rewrite, I'd feel more confident with option A where we could just redirect all responsibility for any potential vulnerabilities to IBKR - at least until we have a larger group of established maintainers who could cross-validate that refactor. Hence, for now I think that anything related to pyCrypto we should just leave as it is until IBKR releases an update.

I know that a similar topic came up earlier when it came to changing some variables from upper case to lower case. Small changes like these sound fine, but I don't think that level of refactoring would be reasonable at the moment.

What was the issue again that we couldn't use the cryptography library?

It was only briefly introduced by Hugh (the author of OAuth 1.0a PR) in order to extract DH prime from a file automatically. I dropped it as I've had issues with reliably installing and distributing packages with cryptography in the past, and there is a much easier solution that I outlined in the OAuth wiki.
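
If the pyCrypto imports stay untouched for now, the four B413 findings above would keep showing up on every Bandit run. A minimal sketch of one way to silence just that check, assuming Bandit is pointed at `pyproject.toml` with `-c pyproject.toml` (the workflow's actual invocation is not shown in this thread):

```toml
[tool.bandit]
# Assumption: skip only the pyCrypto import blacklist check reported above.
# Note that this applies project-wide, not just to ibind/oauth/oauth1a.py.
skips = ["B413"]
```

A narrower alternative would be a `# nosec B413` comment on each flagged import line in `oauth1a.py`, which keeps the check active for the rest of the codebase at the cost of touching the code copied from IBKR.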

@salsasepp

I'd feel more confident with option A where we could just redirect all responsibility for any potential vulnerabilities to IBKR

Maybe a first step could be to make sure IBKR is aware of the issue? Try to put this on their agenda?

If the worst case happens (i.e. an ibind[oauth] user gets their account hacked because of vulnerabilities in pyCrypto), it will be most difficult to prove that their code is to blame. Redirecting responsibility will only work if somebody is willing to take it.

@Voyz
Owner

Voyz commented Mar 10, 2025

@salsasepp I agree it may be a good idea to let IBKR know about this.

Can you elaborate a bit on your second point? It's an interesting discussion your point spawns - possibly worth its own issue. My view is as follows:

Our responsibility certainly has its limit and whether someone else is willing to take it or not doesn't change it. I would say any parts that are copy pasted from IBKR fall outside of that scope. Another example of that is the documentation in the docstrings: almost all of it is verbatim from IBKR documentation, and I don't think it is my responsibility to ensure that they document their endpoints correctly. I think it's important to outline what this library is - an unofficial Python client wrapper around their API that makes its users' lives easier - and what it isn't - an attempt to fix their API and do their job for them.

If the worst case happens, why would we need to prove anything? This is an open source library and nowhere do we provide any guarantees regarding security. If someone submits an order and loses money, would we also need to prove it's not our code that did something wrong? I think it's clear to see it's not on us at all. And if I'm wrong and the truth is contrary, then I'd say we should drop OAuth support completely, as I don't think anyone here would like to take on the responsibility for others' accounts getting hacked. Don't you think a case of "your account got hacked because IBKR publishes outdated code" is much easier for us to deal with than "your account got hacked because we badly implemented OAuth"?

Like I said, I'm not knowledgeable enough on the topic where I could publish the OAuth rewrite confident I did a better job than IBKR. If there are others here who can, I think there should be at least two active maintainers who could cross validate each other and stick around to provide maintenance to this module, as there I'd be more inclined to say that the responsibility falls partially on our side.

That being said, I think it may be fair to outline this in the OAuth wiki, that this code is provided as-is from IBKR and contains known vulnerabilities.

@salsasepp

@salsasepp I agree it may be a good idea to let IBKR know about this.

Can you elaborate a bit on your second point? It's an interesting discussion your point spawns - possibly worth its own issue.

I believe I might have misunderstood your intention, and I was too brief in my answer, creating more misunderstanding. Apologies!

I believe ibind, and its maintainer and contributors are totally safe based upon the terms of the ibind LICENSE.

Let me rephrase my thought, from a trader's perspective: If ibind has vulnerabilities, because code copied verbatim from IBKR has vulnerabilities, and my account gets hacked because of said vulnerabilities, I believe it is highly unlikely that IBKR will take responsibility and compensate me. I am saying that, in this scenario, it does not help me in any way to redirect responsibility to IBKR. For lack of an official OAuth client free of known vulnerabilities, it is ultimately MY responsibility to either A) live with it, B) fix it, or C) stop using OAuth.

We're both leaning towards A) for the time being.

Don't you think a case of "your account got hacked because IBKR publishes outdated code" is much easier for us to deal with than "your account got hacked because we badly implemented OAuth"?

From a contributor's perspective: yes. From a trader's perspective, it does not make any difference.

That being said, I think it may be fair to outline this in the OAuth wiki, that this code is provided as-is from IBKR and contains known vulnerabilities.

That is a good idea in any case, as well as opening a separate issue regarding possible pyCrypto replacement.

BTW, where does IBKR's OAuth implementation come from? Is the code publicly available? For me, there was no onboarding process, I just requested OAuth access and that was it.

@Voyz
Owner

Voyz commented Mar 10, 2025

@salsasepp I understand your point more now, I appreciate you expanding.

BTW, where does IBKR's OAuth implementation come from? Is the code publicly available?

IBKR support sent it to one of the users before we started implementing OAuth. I guess they just give it out on case by case basis.

@weklund
Author

weklund commented Mar 11, 2025

These are great thoughts! Here's the way I see it:

The moment we took code from another source (even if it is very likely the author of the API) and put it in Ibind's repo, Ibind took on more ownership of the security model of Ibind's users to IB. I'm more inclined to be open to option B) in the future, at least after contacting IB and seeing what they plan to do with the information about pyCrypto being deprecated.

Regardless of the 'blame/proof chain', Ibind should do everything within its control to secure its users' data and operational integrity. A shared responsibility model, if you will. This includes defining for users what Ibind's security scope is and what its controls are. They aren't 'guarantees' per se, but more like which parts of the execution flow Ibind will consider in its security scope, and which it will not.

As far as the risks of an 'in-house' implementation go, there's already risk just by supporting oauth. Because IB doesn't have a mechanism to update us if there are future changes to the implementation, the current code we have from them could still fail upon a new release of their implementation. Ibind should still provide this feature, I'm just calling out that Ibind is already open to breaking changes. If that risk exists, I don't see it as any riskier to refactor their oauth to make it more secure and safer for Ibind users.

On the topic of confidence, I completely agree that we should not take this lightly, and should really take it to a different thread than this PR to align on what the expectations are if we were to do option B). I'm open to at least thinking about it, as I believe it's the right thing to do for Ibind users, regardless of the author of the code.

There's a few steps we can take, and seems like we're already aligned on a few:

  1. Notify IB of the risk for using pyCrypto
  2. Disclose this to Ibind users via wiki/readme
  3. Create a backlog item to refactor to using Cryptography at a later date.
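
For scoping that backlog item, it may help to note that the hashing half of the flagged imports (`SHA256`, `HMAC`, `SHA1`) has counterparts in Python's standard library. The sketch below only illustrates that point; the helper names are hypothetical, how `oauth1a.py` actually calls these primitives is not shown in this thread, and the RSA / PKCS1_v1_5 parts have no standard-library equivalent and are not covered here.

```python
import hashlib
import hmac


def hmac_sha256(key: bytes, message: bytes) -> bytes:
    """Return the raw HMAC-SHA256 digest of message under key (stdlib only)."""
    return hmac.new(key, message, hashlib.sha256).digest()


def sha1_hex(data: bytes) -> str:
    """Return the hexadecimal SHA-1 digest of data (stdlib only)."""
    return hashlib.sha1(data).hexdigest()


# Hypothetical inputs, purely for illustration.
signature = hmac_sha256(b'secret-key', b'base-string')
```

Any real replacement would still need to be validated against IBKR's endpoints, which is the cross-validation concern raised earlier in the thread.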

@weklund
Author

weklund commented Mar 11, 2025

but I'm inclined to give your suggestion a go. Would you like to give it a shot implementing the logging singleton?

I'd love to! I'm going to make a todo for this to unblock this PR so I can revisit.

@weklund
Author

weklund commented Mar 11, 2025

@Voyz I think we're getting close for this PR being ready! I would love another look at the full PR diff and see if there's anything we're missing.

Running the linter there's also a lot of files with small diffs that I haven't been checking in yet. I was considering that once we approve the substantive changes. It's nit things like removing unused imports, f-strings when there's no vars to pass in, etc.
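
For readers unfamiliar with these two kinds of nits, here is a small illustration, assuming they correspond to the Pyflakes rules F401 (unused import) and F541 (f-string without placeholders) as implemented in Ruff. The function and variable names are hypothetical and not taken from the Ibind codebase.

```python
# Before (what the linter flags), kept as comments so this block stays lint-clean:
#   import os                      # F401: imported but unused
#   message = f'Connection lost'   # F541: f-string without any placeholders


def describe(attempt: int) -> str:
    # An f-string is kept only where a value is actually interpolated.
    return f'Reconnect attempt {attempt}'


# A plain string literal replaces the placeholder-free f-string.
message = 'Connection lost'
```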

@Voyz
Owner

Voyz commented Mar 12, 2025

Thanks for sharing your views @weklund 👍 This indeed sounds like a discussion for a separate issue. I don't exclude the possibility of having it rewritten in the future. I'll update the Wiki and message IBKR in the meantime.

Owner

@Voyz Voyz left a comment

Great progress altogether @weklund very cool initiative expanding the acronyms and adding meaningful inline comments 👏👏

I've added a few minor comments which should be easily fixable, other than that it looks great. Anything else that's pending for me to review other than the code?

]

ignore = [
"E501", # Ignore line length errors
Owner

This is resolved now?

@Voyz
Owner

Voyz commented Mar 17, 2025

Hey @weklund the code looks great now, I'd say it's ready to merge 🙌

Is there anything else that's pending for me to review other than the code?

@lach1010
Contributor

Hey @weklund the code looks great now, I'd say it's ready to merge 🙌

Is there anything else that's pending for me to review other than the code?

@weklund @Voyz well done some great improvements here!

* master:
  remaking session on ConnectionError only if self.use_session == True
@weklund
Author

weklund commented Mar 18, 2025

Nope! It's ready! @Voyz I'm going to run the linter/formatter on this PR (lot of diffs incoming). Running it here so on the merge we don't have a failed CI - Let me know if anything looks off.

@Voyz
Owner

Voyz commented Mar 18, 2025

hey @weklund I'm reviewing this recent change and it seems quite disrupting, and I don't think these rules are set in pep8

What didn't make sense:

  • Replaced all ' with ". I agree with keeping it consistent though, I always attempt to use ' on upper level and " on inner level, though admittedly I probably missed a few
  • Broke out function calls, class and function definitions and imports over several lines. This is probably the most disrupting one
  • Changed indentation on multi-line parameter definitions

Can we adjust the formatter's behaviour to avoid doing these?

What did make sense:

  • Removed some unnecessary white spaces

@weklund
Author

weklund commented Mar 19, 2025

Ok! I'm glad I checked the formatter changes in so we can take a look before merging :)

I'm having to burn midnight oil to get a feature for work in so apologies for the delays between responses.

Replaced all ' with ". I agree with keeping it consistent though, I always attempt to use ' on upper level and " on inner level, though admittedly I probably missed a few

I have no preference personally, but I believe if we want consistency we probably want to configure ruff to use either all single quotes or all double quotes. I don't believe it's context aware.

Broke out function calls, class and function definitions and imports over several lines. This is probably the most disrupting one

I believe this is max line length. Let me see what I can tweak here. Should I just consider things like function calls, imports can all go on one line?

Changed indentation on multi-line parameter definitions

My assumption here is that the formatter caught bad indentation and fixed. Do you have a good example?

@Voyz
Owner

Voyz commented Mar 19, 2025

I'm having to burn midnight oil

Ha, never heard that expression! 😅 No worries at all, no rush to get this out atm

I have no preference personally, but I believe if we want consistency we probably want to configure ruff to use either all single quotes or all double quotes. I don't believe it's context aware.

Sure, all single on outside and double on inside then.

I believe this is max line length. Let me see what I can tweak here. Should I just consider things like function calls, imports can all go on one line?

Ah right. Yeah, they can go on one line. I break them out when there's more than 3 parameters.

My assumption here is that the formatter caught bad indentation and fixed. Do you have a good example?

First thing I can find in that commit, from subscription_controller.modify_subscription, though any multi-line parameters got reformatted like this. It feels a bit arbitrary.

     def modify_subscription(
-            self,
-            channel: str,
-            status: bool = UNDEFINED,
-            data: dict = UNDEFINED,
-            needs_confirmation: bool = UNDEFINED,
-            subscription_processor: SubscriptionProcessor = UNDEFINED,
+        self,
+        channel: str,
+        status: bool = UNDEFINED,
+        data: dict = UNDEFINED,
+        needs_confirmation: bool = UNDEFINED,
+        subscription_processor: SubscriptionProcessor = UNDEFINED,
     ):

Though admittedly maybe my IDE's one is arbitrary here...

Yes, I found a setting to disable this indentation in my IDE. I think this is one more for the 'What did make sense' list 🙌
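
The two formatter adjustments agreed on above (single quotes, and fewer calls, definitions and imports broken across lines) could be expressed roughly as follows in `pyproject.toml`. This is a sketch, not the configuration that was committed: the `line-length` value is a hypothetical placeholder, since no number was agreed in this thread.

```toml
[tool.ruff]
# Hypothetical value: a wider limit means fewer calls, definitions and imports get wrapped.
line-length = 150

[tool.ruff.format]
# Prefer single quotes for string literals.
quote-style = "single"
```

With `quote-style = "single"` the formatter is still not context aware in the "single outside, double inside" sense. It prefers single quotes and, as far as I know, leaves a string double-quoted when switching would require extra escaping. Running `ruff format --diff` before committing shows what would change without rewriting any files.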
