Local Discovery
Run discovery locally using YAML configuration files. This approach is ideal for automation, CI/CD pipelines, and advanced configurations.
Prerequisites
- Devgraph installed locally
- A Devgraph environment set up
- Access credentials for the systems you want to discover
Configure GitHub Discovery
Create a discovery configuration file named discovery-config.yaml:
providers:
  github:
    # GitHub organization to scan
    organization: "your-org-name"
    # GitHub personal access token
    # Create one at: https://github.com/settings/tokens
    # Required scopes: repo, read:org
    token: ${GITHUB_TOKEN}
    # Optional: filter which repos to discover
    # repository_filter: ".*-api$"  # Only repos ending in -api
    # Optional: discover teams and memberships
    discover_teams: true
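Because repository_filter is a regular expression, it can help to preview which repository names a pattern would select before running discovery. The sketch below assumes the pattern is applied to the bare repository name with search semantics; this guide does not state how Devgraph applies it, and the repository names are hypothetical.

```python
import re

def preview_filter(pattern: str, repo_names: list[str]) -> list[str]:
    """Return the names the pattern selects (search semantics are an assumption)."""
    compiled = re.compile(pattern)
    return [name for name in repo_names if compiled.search(name)]

# Hypothetical repository names, for illustration only
names = ["billing-api", "billing-web", "api-gateway", "users-api"]
selected = preview_filter(r".*-api$", names)
print(selected)
```

With the pattern from the configuration comment, only names that end in -api are kept.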
Store your GitHub token as an environment variable:
export GITHUB_TOKEN="ghp_your_token_here"
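In automation it is easy to forget to export a variable that the configuration references. As a pre-flight check, the sketch below scans the configuration text for ${VAR} placeholders and reports any that are unset. It does not reproduce how Devgraph itself substitutes variables (that is not described here); the function name and the simple comment stripping are illustrative assumptions.

```python
import os
import re

PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")

def missing_variables(config_text: str, environ=os.environ) -> list[str]:
    """Return ${VAR} names referenced outside comments that are unset or empty."""
    missing = []
    for line in config_text.splitlines():
        code = line.split("#", 1)[0]  # naive comment strip, enough for this file
        for name in PLACEHOLDER.findall(code):
            if name not in missing and not environ.get(name):
                missing.append(name)
    return missing

sample = (
    "providers:\n"
    "  github:\n"
    '    organization: "your-org-name"\n'
    "    token: ${GITHUB_TOKEN}\n"
)
print(missing_variables(sample, environ={}))
print(missing_variables(sample, environ={"GITHUB_TOKEN": "ghp_your_token_here"}))
```

In a pipeline, a non-empty result can be used to fail the job before discovery starts.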
Run Discovery
Execute the discovery process:
# Run GitHub discovery
poetry run devgraph discover \
--config config.yaml \
--discovery-config discovery-config.yaml \
--environment {environment-id}
You should see output like:
🔍 Starting discovery for environment: my-org
📦 GitHub Provider: Discovering repositories...
✅ Found 42 repositories
📊 Creating entities...
✅ Created 42 Repository entities
👥 Discovering teams...
✅ Created 8 Team entities
🔗 Creating relationships...
✅ Created 156 relationships
✨ Discovery complete!
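In a CI/CD pipeline you may want to turn this output into numbers you can assert on. The sketch below extracts the "Created N ..." counts from text shaped like the sample above. It assumes the log wording stays exactly as shown, which may not hold across versions, so treat it as a starting point rather than a stable interface.

```python
import re

# Matches "Created 42 Repository entities" and "Created 156 relationships"
CREATED = re.compile(r"Created (\d+) (?:(\w+) entities|relationships)")

def summarize(output: str) -> dict[str, int]:
    """Map each entity kind (or 'relationships') to the count reported as created."""
    counts = {}
    for match in CREATED.finditer(output):
        number, kind = match.groups()
        counts[kind or "relationships"] = int(number)
    return counts

sample_output = """\
✅ Found 42 repositories
✅ Created 42 Repository entities
✅ Created 8 Team entities
✅ Created 156 relationships
"""
summary = summarize(sample_output)
print(summary)
```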
What Gets Discovered?
GitHub Repositories
For each repository, Devgraph creates:
- Repository entity with metadata (name, description, URL, language, etc.)
- Relationships to teams, topics, and dependencies
Example entity:
{
  "apiVersion": "github.com/v1",
  "kind": "Repository",
  "metadata": {
    "name": "my-service",
    "namespace": "default",
    "uid": "uuid-here",
    "labels": {
      "language": "python",
      "visibility": "private"
    }
  },
  "spec": {
    "url": "https://github.com/your-org/my-service",
    "description": "My awesome service",
    "default_branch": "main",
    "topics": ["api", "microservice"]
  }
}
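To show how the entity structure can be consumed, the sketch below parses the example entity and builds a one-line summary from its kind, metadata, and spec. The describe function is a hypothetical helper, and treating labels and topics as optional is an assumption, since only this one example is given.

```python
import json

# The example Repository entity from this section
raw = """
{
  "apiVersion": "github.com/v1",
  "kind": "Repository",
  "metadata": {
    "name": "my-service",
    "namespace": "default",
    "uid": "uuid-here",
    "labels": {"language": "python", "visibility": "private"}
  },
  "spec": {
    "url": "https://github.com/your-org/my-service",
    "description": "My awesome service",
    "default_branch": "main",
    "topics": ["api", "microservice"]
  }
}
"""

def describe(entity: dict) -> str:
    """Build a one-line summary; optional fields fall back to placeholders."""
    metadata = entity["metadata"]
    labels = metadata.get("labels", {})
    spec = entity.get("spec", {})
    topics = ", ".join(spec.get("topics", [])) or "none"
    return (
        f'{entity["kind"]} {metadata["namespace"]}/{metadata["name"]} '
        f'[{labels.get("language", "unknown")}] topics: {topics}'
    )

entity = json.loads(raw)
print(describe(entity))
```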
GitHub Teams (if enabled)
Teams are discovered with:
- Team name and description
- Members and their roles
- Repository permissions
Relationships
Devgraph automatically creates relationships:
- Team → owns → Repository
- Repository → depends_on → Repository (from dependencies)
- Person → member_of → Team
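These relationships can be read as (source, relation, target) triples. The sketch below models them that way and walks two hops to find the people behind a repository. The sample names are hypothetical, and the tuple representation is only an illustration, not Devgraph's storage format.

```python
# Hypothetical sample data using the three relation types listed above
relationships = [
    ("platform-team", "owns", "my-service"),
    ("my-service", "depends_on", "shared-lib"),
    ("alice", "member_of", "platform-team"),
    ("bob", "member_of", "data-team"),
]

def owners_of(repo: str, triples) -> list[str]:
    """Teams with an 'owns' relationship to the repository."""
    return [source for source, relation, target in triples
            if relation == "owns" and target == repo]

def people_responsible_for(repo: str, triples) -> list[str]:
    """People who are members of any team that owns the repository."""
    teams = set(owners_of(repo, triples))
    return sorted(source for source, relation, target in triples
                  if relation == "member_of" and target in teams)

print(owners_of("my-service", relationships))
print(people_responsible_for("my-service", relationships))
```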
Verify Discovery Results
Query the discovered entities:
# List all repositories
curl http://localhost:8000/api/entities \
-H "Devgraph-Environment: {environment-id}" \
-G --data-urlencode "kind=Repository"
# Search for a specific repository
curl http://localhost:8000/api/entities \
-H "Devgraph-Environment: {environment-id}" \
-G --data-urlencode "name=my-service"
# Get entity with relationships
curl http://localhost:8000/api/entities/{entity-uid} \
-H "Devgraph-Environment: {environment-id}"
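The same queries can be issued from a script. The sketch below builds the equivalent of the first two curl commands with the standard library, using only the URL, header, and query parameters shown above. It constructs the request without sending it, because the response format is not described here; build_entity_query is a hypothetical helper name.

```python
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "http://localhost:8000/api/entities"

def build_entity_query(environment_id: str, **filters: str) -> Request:
    """Build a GET request for entities, with optional query filters."""
    url = f"{BASE_URL}?{urlencode(filters)}" if filters else BASE_URL
    return Request(url, headers={"Devgraph-Environment": environment_id})

# Replace the placeholder with your environment ID, as in the curl examples
request = build_entity_query("{environment-id}", kind="Repository")
print(request.full_url)

# To send it against a running Devgraph API:
#   from urllib.request import urlopen
#   with urlopen(request) as response:
#       body = response.read()
```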
Multiple Providers
You can configure multiple providers in the same file:
providers:
  github:
    organization: "your-org"
    token: ${GITHUB_TOKEN}
  gitlab:
    url: "https://gitlab.com"
    group: "your-group"
    token: ${GITLAB_TOKEN}
  docker:
    registry: "docker.io"
    namespace: "your-org"
    username: ${DOCKER_USERNAME}
    password: ${DOCKER_PASSWORD}
  argo:
    server: "https://argocd.example.com"
    token: ${ARGO_TOKEN}
Run discovery for all configured providers (the command is the same as for a single provider):
poetry run devgraph discover \
--config config.yaml \
--discovery-config discovery-config.yaml \
--environment {environment-id}
Scheduling Regular Discovery
Discovery should run periodically to keep your ontology up to date. Set up a cron job:
# Run discovery every hour
# cron does not load your shell profile, so define GITHUB_TOKEN in the crontab or a wrapper script
0 * * * * cd /path/to/devgraph && poetry run devgraph discover --config config.yaml --discovery-config discovery-config.yaml --environment {environment-id}
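Or use a scheduler such as a Kubernetes CronJob:
apiVersion: batch/v1
kind: CronJob
metadata:
  name: devgraph-discovery
spec:
  schedule: "0 * * * *"  # Every hour
  jobTemplate:
    spec:
      template:
        spec:
          # Job pods must set restartPolicy to OnFailure or Never
          restartPolicy: OnFailure
          containers:
            - name: discovery
              image: devgraph:latest
              # The files under /config must be mounted into the pod (volume not shown)
              command:
                - devgraph
                - discover
                - --config
                - /config/config.yaml
                - --discovery-config
                - /config/discovery-config.yaml
                - --environment
                - "$(ENVIRONMENT_ID)"
              env:
                # $(ENVIRONMENT_ID) is expanded only if the variable is defined here
                - name: ENVIRONMENT_ID
                  value: "{environment-id}"
                - name: GITHUB_TOKEN
                  valueFrom:
                    secretKeyRef:
                      name: devgraph-secrets
                      key: github-token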
Next Steps
Now that you've discovered your infrastructure:
- Learn about entity types and relationships
- Explore the Molecules Overview to add more integrations
Troubleshooting
No Entities Found
- Verify your token has the required scopes (for GitHub: repo and read:org)
- Check the organization/group name is correct
- Look for error messages in the logs
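For GitHub classic personal access tokens, the API reports the granted scopes in the X-OAuth-Scopes response header (visible, for example, with curl -sI -H "Authorization: Bearer $GITHUB_TOKEN" https://api.github.com/user). The sketch below checks such a header value against the scopes listed in the configuration comments. It does not apply to fine-grained tokens, which do not report scopes this way, and the helper name is hypothetical.

```python
REQUIRED_SCOPES = {"repo", "read:org"}

# Broader GitHub scopes that include a required one
IMPLIED_BY = {"read:org": {"write:org", "admin:org"}}

def missing_scopes(header_value: str, required=REQUIRED_SCOPES) -> set[str]:
    """Return required scopes not covered by the X-OAuth-Scopes header value."""
    granted = {scope.strip() for scope in header_value.split(",") if scope.strip()}
    missing = set()
    for scope in required:
        covered = scope in granted or bool(IMPLIED_BY.get(scope, set()) & granted)
        if not covered:
            missing.add(scope)
    return missing

print(missing_scopes("repo, read:org"))
print(missing_scopes("repo"))
```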
Rate Limiting
If you hit API rate limits:
- Use a token with higher limits (GitHub Apps have higher limits)
- Add delays between requests in the provider config
- Run discovery less frequently
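To decide whether a scheduled run should wait, you can inspect GitHub's rate limit status, which GET https://api.github.com/rate_limit returns as JSON with the remaining calls and the reset time (epoch seconds) under resources.core. The sketch below computes how long to wait from such a response. The min_remaining threshold and the sample numbers are assumptions for illustration, not Devgraph settings.

```python
import time
from typing import Optional

def seconds_until_safe(rate_limit: dict, min_remaining: int = 100,
                       now: Optional[float] = None) -> float:
    """Return 0 if enough calls remain, else the seconds until the limit resets."""
    core = rate_limit["resources"]["core"]
    if core["remaining"] >= min_remaining:
        return 0.0
    current = time.time() if now is None else now
    return max(0.0, core["reset"] - current)

# Hypothetical response body, trimmed to the fields used here
nearly_exhausted = {"resources": {"core": {"limit": 5000, "remaining": 12,
                                           "reset": 1700000600}}}
wait = seconds_until_safe(nearly_exhausted, now=1700000000)
print(wait)
```

A wrapper script could sleep for the returned number of seconds before starting discovery.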
Missing Relationships
Some relationships only appear after a second discovery run:
- First run: Creates entities
- Second run: Discovers dependencies and creates relationships
If relationships are missing, run discovery twice to get complete data.
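Running discovery twice is easy to script. The sketch below builds the command from the flags used throughout this guide and runs it back to back. It assumes the CLI exits with a non-zero status on failure, which is not confirmed here; the function names are hypothetical.

```python
import subprocess

def discovery_command(environment_id: str,
                      config: str = "config.yaml",
                      discovery_config: str = "discovery-config.yaml") -> list[str]:
    """Build the discover command with the flags shown in this guide."""
    return [
        "poetry", "run", "devgraph", "discover",
        "--config", config,
        "--discovery-config", discovery_config,
        "--environment", environment_id,
    ]

def run_discovery(environment_id: str, runs: int = 2) -> None:
    """Run discovery several times in a row, stopping at the first failure."""
    command = discovery_command(environment_id)
    for attempt in range(1, runs + 1):
        print(f"Discovery run {attempt}/{runs}")
        subprocess.run(command, check=True)  # raises CalledProcessError on failure

# Usage, from the Devgraph project directory:
#   run_discovery("{environment-id}")
```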