Assessing the Implications of Using msoffcrypto for Open Sourcing a Medical Data Processing Pipeline

Çağlar Arlı

I am building a Python pipeline that reads and processes sensitive personal medical data from password-protected Excel files. The pipeline uses the msoffcrypto library, specifically the OfficeFile.load_key and OfficeFile.decrypt methods, to decrypt these files.
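For context, the decryption step can be sketched roughly as follows. This is a minimal illustration rather than the actual pipeline code: the helper name decrypt_workbook and its parameters are hypothetical, and only OfficeFile, load_key, and decrypt come from msoffcrypto. It assumes the password is supplied at runtime and that the decrypted bytes are kept in memory.

```python
import io


def decrypt_workbook(path, password):
    """Return the decrypted workbook as an in-memory buffer (hypothetical helper)."""
    # Imported here so the dependency (PyPI package "msoffcrypto-tool") is only
    # required when decryption actually runs.
    import msoffcrypto

    decrypted = io.BytesIO()
    with open(path, "rb") as encrypted:
        office_file = msoffcrypto.OfficeFile(encrypted)
        # The password is passed in by the caller, never stored in the code.
        office_file.load_key(password=password)
        # The decrypted bytes are written to the in-memory buffer, not to disk.
        office_file.decrypt(decrypted)
    decrypted.seek(0)  # rewind so a reader starts at the beginning
    return decrypted
```

The returned buffer can be handed directly to an Excel reader that accepts file-like objects, so no decrypted copy of the workbook needs to be written to disk.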

Given the nature of the data and the importance of maintaining privacy and security, I am seeking advice on open sourcing this pipeline. Here are the details and considerations:

The pipeline's primary function is to automate the processing of sensitive medical data stored in Excel files, which are password-protected as an extra layer of security. I am aware that the actual passwords must be omitted from any code or documentation I make public.

I have researched common security practices for open-source projects and have obfuscated the parts of the code where sensitive information might appear.

Before proceeding, I want to ensure I'm not overlooking any potential security or privacy issues that could arise from making the pipeline's code available to the public. Here is what I have considered and tried so far:

  • Password Management: Ensuring no hard-coded passwords are present in the codebase.
  • Code Review: Conducting thorough code reviews to check for any inadvertent inclusion of sensitive information.
  • Documentation: Preparing documentation that provides guidelines on how to securely use the pipeline without exposing sensitive data.
  • Security Audit: Planning to perform a security audit of the code to check for vulnerabilities.
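As an illustration of the password management point above, one way to keep passwords out of the codebase is to read them at runtime. This is a minimal sketch under stated assumptions: the environment variable name PIPELINE_XLSX_PASSWORD and the helper get_password are hypothetical, and a dedicated secrets manager may be a better fit in a real deployment.

```python
import getpass
import os

# Hypothetical variable name; set it in the deployment environment,
# never in the repository.
ENV_VAR = "PIPELINE_XLSX_PASSWORD"


def get_password(env_var=ENV_VAR):
    """Return the workbook password from the environment or an interactive prompt."""
    password = os.environ.get(env_var)
    if password:
        return password
    # Fall back to a prompt that does not echo the typed characters.
    return getpass.getpass("Workbook password: ")
```

With this approach the published code only names where the password comes from, so nothing secret needs to be committed or obfuscated.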

My question is: Are there any other significant issues or best practices I should consider before open sourcing a tool of this nature? Furthermore, are there any specific aspects of the msoffcrypto library that I should be particularly cautious about in the context of open source?

Any guidance or insights on this matter would be greatly appreciated.