Update - We've limited the creation of new 7.x hot-warm deployments, and upgrades of hot-warm deployments to the 7.x series, until 7.1.1 is released, due to excessive IO and the resulting instability on those versions.

The release process for Elasticsearch 7.1.1 is underway, and we expect the release to be available early next week. When 7.1.1 is available, we will begin upgrading all existing 7.x hot-warm deployments in Cloud to that version. Customers will then be able to create new 7.x hot-warm clusters on 7.1.1+.

If you have a hot-warm deployment that is experiencing performance issues (slow API calls or nodes dropping out of the cluster), please reach out to support@elastic.co for assistance.
May 24, 00:21 UTC
Update - We have begun the release process for Elasticsearch 7.1.1; it will take a few days to complete. Once 7.1.1 is ready, we will begin upgrading all 7.x hot-warm deployments in Cloud to that version. Until 7.1.1 is released, we are limiting the creation of new 7.x hot-warm deployments, and upgrades of hot-warm deployments to the 7.x series, due to excessive IO and the resulting instability. If you have a hot-warm deployment that is experiencing performance issues (slow API calls or nodes dropping out of the cluster), please reach out to support@elastic.co for assistance.

We will update you again when 7.1.1 is released.
May 23, 18:05 UTC
Update - We've confirmed that the changes made by the Elasticsearch team fix the performance issues with hot-warm deployments. We're continuing to work closely with the Elasticsearch team to make the fix available to Cloud customers as soon as possible. We will update you again when we have a timeframe for when the release will be available on Cloud and to download via elastic.co/downloads.
May 22, 06:15 UTC
Identified - After digging into the weeds of Linux file systems (IO schedulers, xfsslower tracing, and friends), we suspect that the performance degradation is due to excessive fsyncs. To support Cross-Cluster Replication, Elasticsearch retains historical operations on certain indices. The number of historical operations retained in an index is controlled by a new mechanism called retention leases. The leases are maintained by the primary copy of each shard and synchronized to the replicas. With every synchronization, we issue an fsync to the file system to persist the file where the leases are stored. For simplicity, we currently sync the leases every 30 seconds. Sadly, for clusters that have a lot of shards and run on spinning disks (hello, warm nodes!), this creates a lot of fsyncs. These numerous fsyncs appear to be causing heavy IO load on the machines and delays in persisting cluster state updates to disk. The delays can be so large that the new cluster coordination subsystem deems the nodes unstable and removes them from the cluster. These fsyncs only arise on indices created since 6.5 with a special index setting; to support future features, that setting is the default for indices created since 7.0. The Elasticsearch team has already created a pull request to fix this issue, and we are currently working on confirming it in our staging Cloud environment. Once we have confirmed that the pull request fixes the issue, we will take the necessary next steps to get this fixed for all impacted Cloud users (and other users of Elasticsearch). We will update you again when we have confirmation of the fix (ETA 6 hours).
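A rough back-of-the-envelope sketch of the scale of the problem: the 30-second per-shard lease sync comes from the update above, but the shard count and disk IOPS figures below are illustrative assumptions, not measurements from any real deployment.

```python
# Rough estimate of the background fsync load generated by retention lease
# syncs, based on the 30-second sync interval described in the update.
# Shard count and HDD IOPS are illustrative assumptions, not measurements.

SYNC_INTERVAL_S = 30        # retention leases are synced every 30 seconds
SHARD_COPIES = 3_000        # hypothetical: shard copies hosted on a warm node
SPINNING_DISK_IOPS = 150    # hypothetical: rough IOPS budget of one spinning disk

# Each shard copy persists its lease file with an fsync on every sync.
fsyncs_per_second = SHARD_COPIES / SYNC_INTERVAL_S

print(f"~{fsyncs_per_second:.0f} fsyncs/s of pure lease-sync overhead")
print(f"that is {fsyncs_per_second / SPINNING_DISK_IOPS:.0%} of one disk's IOPS")
```

Under these assumptions the lease syncs alone consume a large fraction of a spinning disk's IOPS budget before any indexing or search IO, which is consistent with cluster state persistence falling behind on warm nodes.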
May 21, 21:22 UTC
Update - We've determined this issue is affecting up to 5% of hot-warm deployments in both AWS- and GCP-based regions. Customers with hot-warm deployments experiencing any of the following symptoms are encouraged to contact support@elastic.co:
* Slow response times
* API calls timing out
* Temporarily unavailable shards on the warm tier
* New hot-warm deployments timing out during initial deployment

We'll update this incident within 24 hours, or sooner if we determine an effective mitigation for the affected instances and a longer-term fix.
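If you want to check for the "temporarily unavailable shards" symptom yourself, Elasticsearch's `GET _cat/shards?h=index,shard,prirep,state,node` API lists each shard copy's state; any state other than STARTED indicates a copy that is not fully available. A minimal sketch of parsing that output follows — the sample response is hypothetical, and in a live deployment you would fetch the text over HTTPS from your cluster endpoint instead:

```python
# Flag shard copies that are not in the STARTED state, using the plain-text
# output of `GET _cat/shards?h=index,shard,prirep,state,node`.
# The sample below is hypothetical so the sketch is self-contained.

SAMPLE_CAT_SHARDS = """\
logs-2019.05.01 0 p STARTED      warm-node-1
logs-2019.05.01 0 r UNASSIGNED
logs-2019.05.02 0 p RELOCATING   warm-node-2
metrics-hot     0 p STARTED      hot-node-1
"""

def unavailable_shards(cat_shards_text):
    """Return (index, shard, state) for every copy whose state is not STARTED.

    UNASSIGNED copies have no node column, so lines may have 4 or 5 fields.
    """
    problems = []
    for line in cat_shards_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[3] != "STARTED":
            problems.append((fields[0], fields[1], fields[3]))
    return problems

print(unavailable_shards(SAMPLE_CAT_SHARDS))
# → [('logs-2019.05.01', '0', 'UNASSIGNED'), ('logs-2019.05.02', '0', 'RELOCATING')]
```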
May 20, 23:52 UTC
Update - We're currently looking into short-term mitigations for the disk performance issues affecting warm-tier nodes in hot-warm deployments on AWS. We are also discussing potential longer-term solutions for the disk performance. We will post further updates on this issue within the next 6 hours. If you are impacted by this issue or have questions about it, please contact support@elastic.co.
May 20, 18:14 UTC
Update - We are continuing to investigate this issue.
May 20, 16:15 UTC
Investigating - We're currently experiencing performance issues with a small percentage of warm-tier nodes in hot-warm deployments on AWS. We have identified the issue, which appears to be related to disk performance on that server tier. Customers may experience the following symptoms: slow response times, timed-out API calls, and temporarily unavailable shards on the warm tier; new hot-warm deployments may also time out during initial deployment. We are currently determining a long-term fix for the problem. If you are impacted by this issue, please contact support@elastic.co.
May 20, 15:48 UTC
Cluster Management Operational
Cluster Management Console Service Operational
Cluster Management API Operational
Cluster Orchestration Operational
Cluster Metrics Operational
Cluster Snapshots Operational
AWS Marketplace Operational
GCP us-central1 Operational
Cluster Connectivity: GCP us-central1 Operational
Kibana Connectivity: GCP us-central1 Operational
APM Connectivity: GCP us-central1 Operational
GCP us-west1 Operational
Cluster Connectivity: GCP us-west1 Operational
Kibana Connectivity: GCP us-west1 Operational
APM Connectivity: GCP us-west1 Operational
GCP europe-west1 Operational
Cluster Connectivity: GCP europe-west1 Operational
Kibana Connectivity: GCP europe-west1 Operational
APM Connectivity: GCP europe-west1 Operational
GCP europe-west3 Operational
Cluster Connectivity: GCP europe-west3 Operational
Kibana Connectivity: GCP europe-west3 Operational
APM Connectivity: GCP europe-west3 Operational
Google Cloud Platform Operational
Google Compute Engine Operational
Google Cloud Storage Operational
AWS N. Virginia (us-east-1) Operational
Cluster Connectivity: AWS us-east-1 Operational
AWS EC2 Health: us-east-1 Operational
Snapshot Storage Infrastructure (S3): us-east-1 Operational
Kibana Connectivity: AWS us-east-1 Operational
APM Connectivity: AWS us-east-1 Operational
AWS N. California (us-west-1) Operational
Cluster Connectivity: AWS us-west-1 Operational
AWS EC2 Health: us-west-1 Operational
Snapshot Storage Infrastructure (S3): us-west-1 Operational
Kibana Connectivity: AWS us-west-1 Operational
APM Connectivity: AWS us-west-1 Operational
AWS Ireland (eu-west-1) Operational
Cluster Connectivity: AWS eu-west-1 Operational
AWS EC2 Health: eu-west-1 Operational
Snapshot Storage Infrastructure (S3): eu-west-1 Operational
Kibana Connectivity: AWS eu-west-1 Operational
APM Connectivity: AWS eu-west-1 Operational
AWS Frankfurt (eu-central-1) Operational
Cluster Connectivity: AWS eu-central-1 Operational
AWS EC2 Health: eu-central-1 Operational
Snapshot Storage Infrastructure (S3): eu-central-1 Operational
Kibana Connectivity: AWS eu-central-1 Operational
APM Connectivity: AWS eu-central-1 Operational
AWS Oregon (us-west-2) Operational
Cluster Connectivity: AWS us-west-2 Operational
AWS EC2 Health: us-west-2 Operational
Snapshot Storage Infrastructure (S3): us-west-2 Operational
Kibana Connectivity: AWS us-west-2 Operational
APM Connectivity: AWS us-west-2 Operational
AWS São Paulo (sa-east-1) Operational
Cluster Connectivity: AWS sa-east-1 Operational
AWS EC2 Health: sa-east-1 Operational
Snapshot Storage Infrastructure (S3): sa-east-1 Operational
Kibana Connectivity: AWS sa-east-1 Operational
APM Connectivity: AWS sa-east-1 Operational
AWS Singapore (ap-southeast-1) Operational
Cluster Connectivity: AWS ap-southeast-1 Operational
AWS EC2 Health: ap-southeast-1 Operational
Snapshot Storage Infrastructure (S3): ap-southeast-1 Operational
Kibana Connectivity: AWS ap-southeast-1 Operational
APM Connectivity: AWS ap-southeast-1 Operational
AWS Sydney (ap-southeast-2) Operational
Cluster Connectivity: AWS ap-southeast-2 Operational
AWS EC2 Health: ap-southeast-2 Operational
Snapshot Storage Infrastructure (S3): ap-southeast-2 Operational
Kibana Connectivity: AWS ap-southeast-2 Operational
APM Connectivity: AWS ap-southeast-2 Operational
AWS Tokyo (ap-northeast-1) Operational
Cluster Connectivity: AWS ap-northeast-1 Operational
AWS EC2 Health: ap-northeast-1 Operational
Snapshot Storage Infrastructure (S3): ap-northeast-1 Operational
Kibana Connectivity: AWS ap-northeast-1 Operational
APM Connectivity: AWS ap-northeast-1 Operational
Heroku Operational
Elastic Maps Service Operational
Past Incidents
May 26, 2019

No incidents reported today.

May 25, 2019

No incidents reported.

May 21, 2019
Resolved - Due to a misconfigured deployment job, all proxies were removed from the load balancer instead of only a percentage of them. All traffic in aws-eu-central-1 (AWS Frankfurt) was impacted from Tue May 21 2019 20:09 to 20:15 UTC. The deployment job had been manually updated to decrease deployment time. To prevent this from happening again, we will no longer change the default for this deployment job, and we will add a second layer of verification whenever it does change. We apologize for the mistake.
May 21, 20:43 UTC
May 19, 2019

No incidents reported.

May 18, 2019

No incidents reported.

May 17, 2019

No incidents reported.

May 16, 2019

No incidents reported.

May 15, 2019

No incidents reported.

May 14, 2019

No incidents reported.

May 13, 2019

No incidents reported.

May 12, 2019

No incidents reported.