As of the end of September 2024, ThreadFix version 3.x on-premises has officially reached its End-of-Life, and this version of the product no longer receives support or updates. Our product and development teams have fully transitioned to ThreadFix SaaS and to migrating all customers over from the on-premises versions. Our Customer Success and Support teams are here to help you migrate to ThreadFix SaaS and maximize the value you see from this improved offering from Coalfire. This is the next phase of ThreadFix, and our team looks forward to continuing to support you on this journey.
Configurations and Tuning Guide
You will learn
About the various service configurations and tuning options in ThreadFix.
Prerequisites
Audience: IT Professional
Difficulty: Advanced
Time needed: Approximately 25 minutes
Tools required: N/A
General Recommendations
Recommendation | Keywords | Importance
---|---|---
For large-scale deployments with more than a few thousand applications, contact ThreadFix Support for involvement in the deployment and configuration process. | new deployment, optimal configuration | High
Before bringing the ThreadFix deployment down or restarting it, allow scans in the ingestion pipeline to flush and finish processing without interruption. | restarting, upgrades, applying config changes, maintenance routines | Critical
Monitor average and peak CPU/RAM utilization on the database and application servers during peak scan ingestion activity. Under-allocated resources can hinder the performance and stability of the ingestion pipeline. If the database server resources aren't sufficient and increasing them isn't an option, consider scaling down the scan ingestion services, especially the Data Writer service. It's recommended to start with slightly over-allocated resources; allocation can then be optimized after monitoring resource utilization in the environment under typical/average activity and load. | resource allocation, utilization, and monitoring | Critical
Scan/Application/Team delete jobs acquire global locks that block other jobs while processing, so it's recommended to run these jobs during scheduled maintenance hours. | scan ingestion throughput hindrance, maintenance routines | High
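To make the resource-allocation guidance above concrete, the sketch below shows one way CPU and memory limits could be pinned for the scan ingestion services in a Docker Compose override file. The service names (appsec-importer, appsec-vip, appsec-data) are the Docker Compose names listed later in this guide; the limit values are placeholders to be tuned against the utilization you observe, not recommended settings.

```yaml
# docker-compose.override.yml — illustrative only; CPU/memory values are
# placeholders to be tuned against utilization observed in your environment.
services:
  appsec-importer:
    deploy:
      resources:
        limits:
          cpus: "2.0"   # placeholder
          memory: 4G    # placeholder
  appsec-vip:
    deploy:
      resources:
        limits:
          cpus: "2.0"   # placeholder
          memory: 4G    # placeholder
  appsec-data:
    deploy:
      resources:
        limits:
          cpus: "4.0"   # placeholder; the data writer drives most database writes
          memory: 8G    # placeholder
```

Starting slightly over-allocated and then tightening these limits after observing typical load follows the recommendation above.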
Service Configurations

The configuration details below are broken out per service: the Importer Service, the Vulnerability Ingestion Processor (VIP Service), the Data Writer Service, and the AppSec Core/Main Application.

Overview

- Importer Service: Contains two consumers that process the following:
  - Raw Scan File Consumer. Pending Scan Statuses:
  - Remote Provider Import Request Consumer. Remote Provider Import Request Statuses: For a bulk import, this request imports new scans for all mapped Remote Provider Apps sequentially.
    - Remote Provider Application Import Attempt Statuses
    - Pending Scan Statuses:
  - Essentially, both of these consumers produce a parsed and normalized scan, which is stored in the staging storage (MinIO) along with some metadata written to the database.
- Vulnerability Ingestion Processor (VIP Service): Consumes a parsed and normalized scan. Pending Scan Statuses:
- Data Writer Service: Pending Scan Statuses:
Application Threads (approximate counts)

Increasing available CPU cores can lead to better performance for services that have more processing threads/consumers. The counts below do not include the Kafka consumer's background heartbeat thread(s).

- Importer Service: ~2
- Vulnerability Ingestion Processor (VIP Service): ~1 (does not include additional threads that can be utilized by Kafka's asynchronous producers, which produce Processed Finding and Vulnerability results to Kafka for database ingestion)
- Data Writer Service: ~20

Docker Compose service name/overrides

- Importer Service: appsec-importer
- Vulnerability Ingestion Processor (VIP Service): appsec-vip
- Data Writer Service: appsec-data

K8 service name/overrides
Max Processing Time

This translates to the Kafka consumer max.poll.interval.ms configuration, which dictates how long a message or job can take to process. Increasing this time may cause consumers/workers to take longer to rebalance, especially when scaling up and down in a busy system/full pipeline.

- Importer Service: TF Default: 2 hours. Consider increasing this to allow Remote Provider Bulk Import Requests to run for longer periods and to allow importing scans for all configured and mapped Remote Provider Applications, if any of the connection configurations listed below apply:
- Vulnerability Ingestion Processor (VIP Service): TF Default: 6 minutes. Consider increasing this if very large scans are processed frequently and the "Processing" stage needs more time to complete successfully.
- Data Writer Service: Kafka Default: 5 minutes.
- AppSec Core/Main Application: TF Default: 2 hours.

Docker Compose Env Config

- Data Writer Service: Override the Kafka max.poll.interval.ms for this service only if truly necessary. Increasing it is not recommended; most operations performed by this service run quickly and efficiently and should not need more than a few seconds at most.

K8 Env Config
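The exact environment variables ThreadFix uses for these overrides are not reproduced in this table, so the snippet below is only a hedged sketch of the general pattern: passing the consumer's max.poll.interval.ms through a service's environment in a Docker Compose override. The variable name KAFKA_MAX_POLL_INTERVAL_MS is hypothetical; check the documented configuration keys for your ThreadFix version before applying anything like this.

```yaml
# Hypothetical sketch — the variable name below is illustrative, not the
# documented ThreadFix configuration key.
services:
  appsec-importer:
    environment:
      # Kafka max.poll.interval.ms is expressed in milliseconds; 3 hours
      # (10800000 ms) is shown as an example of extending the 2-hour default
      # for long-running Remote Provider bulk imports.
      KAFKA_MAX_POLL_INTERVAL_MS: "10800000"
```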
Kafka Partition Count Configs

The number of partitions for the Kafka topics a service consumes from and produces messages/data to. Partitions allow for concurrency if you would like to scale services to process messages concurrently.

Important:

Docker Compose Env Config

K8 Env Config
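The partition-count environment variables are likewise not listed here, so the following is only a hedged illustration of the underlying Kafka behavior: a topic's partition count caps how many consumer instances can process it in parallel, so it should be at least as large as the number of service replicas you plan to run. The variable name below is hypothetical.

```yaml
# Hypothetical sketch — the variable name is illustrative only.
services:
  appsec-vip:
    environment:
      # With 3 partitions on the topic this service consumes from, at most
      # 3 VIP instances can consume concurrently; extra instances sit idle.
      SCAN_TOPIC_PARTITION_COUNT: "3"
```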
Scaling Guide

When to scale?

Important:

- Importer Service:
  - A. Remote Provider Imports: Ideally, the number of Importer Services and the Remote Provider Import concurrency should match the number of Remote Provider connection configurations; scaling beyond this number will likely result in idle services/consumers. A bulk import for a single Remote Provider connection configuration is picked up and processed by a single Importer Service, and the Bulk Import job will sequentially import scans for each mapped app and drop them onto the ingestion pipeline.
  - B. Scan File Imports:
- Vulnerability Ingestion Processor (VIP Service): Scale up if reducing the time spent in the following Pending Scan stages is desired:
- Data Writer Service: Scale up if reducing the time spent in the following Pending Scan stages is desired:
  - Warning: If the database server resources aren't sufficient and increasing them isn't an option, consider scaling down the number of Data Writer services.
- AppSec Core/Main Application: Limited to 1.

Docker Compose Scaling Command

K8 Env Config
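The scaling command itself is not reproduced above, so here is a hedged sketch of how the ingestion services could be scaled using standard Docker Compose replica settings and the service names from this guide. The replica counts are examples only, and the AppSec Core/Main Application should stay at a single instance as noted above.

```yaml
# Illustrative replica counts only — size them against database capacity and
# the observed ingestion backlog, and keep the main application at one instance.
services:
  appsec-vip:
    deploy:
      replicas: 2   # example: shorten the "Processing" stage of Pending Scans
  appsec-data:
    deploy:
      replicas: 3   # example: scale down instead if the database is the bottleneck
# Equivalent one-off scaling with the Compose CLI (example):
#   docker compose up -d --scale appsec-vip=2 --scale appsec-data=3
```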