tag:status.cytora.com,2005:/historyCytora Status - Incident History2024-03-29T05:23:55+00:00Cytoratag:status.cytora.com,2005:Incident/90469412022-01-08T05:32:00+00:002022-01-10T17:14:27+00:00System outage due to production domain issue<p><small>Jan <var data-var='date'> 8</var>, <var data-var='time'>05:32</var> GMT</small><br><strong>Resolved</strong> - The outage was linked to issues with our production domain, which blocked users from logging in and interacting with the system. No data losses were expected.<br /><br />The issue has now been remedied by a fix to the domain/certificates. All systems are back up and operational.</p>tag:status.cytora.com,2005:Incident/81021202021-09-28T18:40:58+01:002021-09-29T09:52:52+01:00Gateway outage<p><small>Sep <var data-var='date'>28</var>, <var data-var='time'>18:40</var> BST</small><br><strong>Resolved</strong> - Service has been restored.</p><p><small>Sep <var data-var='date'>28</var>, <var data-var='time'>17:21</var> BST</small><br><strong>Monitoring</strong> - A fix was implemented on Auth0's side and we are monitoring the results.</p><p><small>Sep <var data-var='date'>28</var>, <var data-var='time'>16:44</var> BST</small><br><strong>Identified</strong> - We have identified the issue, which is related to Auth0 (https://status.auth0.com/incidents/vwq8f1rcsdtf), our authentication service provider. We apologise for the impact, and our engineers are monitoring the ongoing resolution of the issue.</p><p><small>Sep <var data-var='date'>28</var>, <var data-var='time'>15:14</var> BST</small><br><strong>Investigating</strong> - We are currently investigating an issue with our gateway service that is leaving users unable to log in.</p>tag:status.cytora.com,2005:Incident/25973322019-06-24T14:51:45+01:002019-06-24T14:58:08+01:00DNS resolution problems<p><small>Jun <var data-var='date'>24</var>, <var data-var='time'>14:51</var> BST</small><br><strong>Resolved</strong> - Our monitoring has shown no issues over the past hour. 
We will continue monitoring the system to make sure the API requests are working as expected.</p><p><small>Jun <var data-var='date'>24</var>, <var data-var='time'>13:51</var> BST</small><br><strong>Monitoring</strong> - The riskengine.io domain resolution problem has been resolved by our DNS provider. We are continuing to monitor for any further issues.</p><p><small>Jun <var data-var='date'>24</var>, <var data-var='time'>12:21</var> BST</small><br><strong>Identified</strong> - Our DNS provider is experiencing some problems with the resolution of the riskengine.io domain. Some requests to riskengine.io might fail.</p>tag:status.cytora.com,2005:Incident/21067552018-12-19T16:52:22+00:002018-12-19T16:52:22+00:00Cytora API is not responding<p><small>Dec <var data-var='date'>19</var>, <var data-var='time'>16:52</var> GMT</small><br><strong>Resolved</strong> - The issue has been resolved, and our services are back to normal.</p><p><small>Dec <var data-var='date'>19</var>, <var data-var='time'>14:54</var> GMT</small><br><strong>Monitoring</strong> - The networking problem has been resolved by our cloud service provider. 
We are continuing to monitor for any further issues.</p><p><small>Dec <var data-var='date'>19</var>, <var data-var='time'>14:52</var> GMT</small><br><strong>Identified</strong> - We have identified the root cause of the problem: networking issues at our cloud service provider.</p><p><small>Dec <var data-var='date'>19</var>, <var data-var='time'>14:16</var> GMT</small><br><strong>Investigating</strong> - We are currently experiencing some issues with our preproduction and production environments and our team is investigating.</p>tag:status.cytora.com,2005:Incident/19653962018-10-15T18:12:31+01:002018-10-15T18:12:31+01:00Upgrade infrastructure in UAT and Production<p><small>Oct <var data-var='date'>15</var>, <var data-var='time'>18:12</var> BST</small><br><strong>Completed</strong> - The update has been completed, and our services are back to normal.</p><p><small>Oct <var data-var='date'>15</var>, <var data-var='time'>17:00</var> BST</small><br><strong>In progress</strong> - Scheduled maintenance is currently in progress. We will provide updates as necessary.</p><p><small>Oct <var data-var='date'>12</var>, <var data-var='time'>10:12</var> BST</small><br><strong>Scheduled</strong> - We will be upgrading our underlying infrastructure's security and performance. We anticipate no more than 1hr of degraded service. During this time, some requests to our API could fail.</p>tag:status.cytora.com,2005:Incident/19277512018-09-24T16:58:38+01:002018-09-24T16:58:38+01:00Partial outage of Risk Engine production API<p><small>Sep <var data-var='date'>24</var>, <var data-var='time'>16:58</var> BST</small><br><strong>Resolved</strong> - Our monitoring has shown no issues over the past two hours. 
We will continue monitoring the system to make sure the API requests are working as expected.</p><p><small>Sep <var data-var='date'>24</var>, <var data-var='time'>15:23</var> BST</small><br><strong>Monitoring</strong> - The Risk Engine API cluster is handling requests as expected and we are monitoring it closely.</p><p><small>Sep <var data-var='date'>24</var>, <var data-var='time'>15:13</var> BST</small><br><strong>Investigating</strong> - We are observing that some of the requests to the Risk Engine API are failing and we are currently investigating the issue.</p>tag:status.cytora.com,2005:Incident/18745712018-09-04T14:36:23+01:002018-09-04T14:36:23+01:00Elasticsearch Degraded Performance<p><small>Sep <var data-var='date'> 4</var>, <var data-var='time'>14:36</var> BST</small><br><strong>Resolved</strong> - The degraded performance has been resolved by Elastic Cloud. We have been monitoring the performance for the past few days and everything is back to normal.</p><p><small>Aug <var data-var='date'>23</var>, <var data-var='time'>09:30</var> BST</small><br><strong>Identified</strong> - We use ElasticCloud as a hosted solution for Elasticsearch. This tool supports the Cytora Address search engine, and the issue could affect the availability of the GET addresses/autocomplete and POST invocations endpoints. 
However, based on our internal monitoring, these endpoints were available over 99.5% of the time in the last 24h.<br /><br />The original ElasticCloud status can be monitored at https://cloud-status.elastic.co/.</p>tag:status.cytora.com,2005:Incident/18003112018-07-04T18:41:51+01:002018-07-04T18:41:51+01:00API response times slowed<p><small>Jul <var data-var='date'> 4</var>, <var data-var='time'>18:41</var> BST</small><br><strong>Resolved</strong> - Google has resolved their network issues, and our services are back to normal.</p><p><small>Jul <var data-var='date'> 4</var>, <var data-var='time'>12:56</var> BST</small><br><strong>Identified</strong> - Google Cloud is suffering network issues, which are causing calls to our API to take up to 10x as long as expected. Some requests will time out. We're monitoring Google's status page (https://status.cloud.google.com/incident/compute/18007#18007002) and will post an update when we know more about the problem.</p>tag:status.cytora.com,2005:Incident/17749572018-06-19T12:00:46+01:002018-06-19T12:00:46+01:00Upgrading database redundancy in production<p><small>Jun <var data-var='date'>19</var>, <var data-var='time'>12:00</var> BST</small><br><strong>Completed</strong> - The scheduled maintenance has been completed.</p><p><small>Jun <var data-var='date'>19</var>, <var data-var='time'>11:01</var> BST</small><br><strong>In progress</strong> - Scheduled maintenance is currently in progress. We will provide updates as necessary.</p><p><small>Jun <var data-var='date'>18</var>, <var data-var='time'>15:41</var> BST</small><br><strong>Scheduled</strong> - We will be adding additional redundancy to our production databases. We anticipate no more than 1hr of degraded service. 
During this time, requests to create and update invocations could fail.</p>tag:status.cytora.com,2005:Incident/17460562018-05-29T14:46:12+01:002018-05-29T14:46:12+01:00Service interrupted<p><small>May <var data-var='date'>29</var>, <var data-var='time'>14:46</var> BST</small><br><strong>Resolved</strong> - The issue has been resolved, and our services are back to normal.</p><p><small>May <var data-var='date'>29</var>, <var data-var='time'>14:38</var> BST</small><br><strong>Monitoring</strong> - Our team has deployed a fix. We are continuing to monitor for any further issues.</p><p><small>May <var data-var='date'>29</var>, <var data-var='time'>14:37</var> BST</small><br><strong>Identified</strong> - We have identified the cause of the issue. Our team is working on a fix to bring our address search engine back online.</p><p><small>May <var data-var='date'>29</var>, <var data-var='time'>14:30</var> BST</small><br><strong>Investigating</strong> - We are currently experiencing downtime on some of our endpoints and our team is investigating.</p>tag:status.cytora.com,2005:Incident/17257362018-04-13T13:00:00+01:002018-05-14T11:56:44+01:00Elevated errors on policies endpoint following deployment<p><small>Apr <var data-var='date'>13</var>, <var data-var='time'>13:00</var> BST</small><br><strong>Resolved</strong> - Customers experienced an elevated error rate on our policies endpoint after API deployment. This could have affected the ability to create or retrieve policy records.<br /><br />The service has gradually recovered with latency and error rates returning to normal levels. 
This issue is now resolved.</p>tag:status.cytora.com,2005:Incident/17257082018-03-28T13:00:00+01:002018-05-14T19:52:19+01:00Elevated error rate following database upgrade<p><small>Mar <var data-var='date'>28</var>, <var data-var='time'>13:00</var> BST</small><br><strong>Resolved</strong> - We experienced an elevated number of errors and increased latency on our invocation endpoints after our database upgrade. Our engineers have fixed the database migration which caused the issue and all the endpoints work as expected now.</p>tag:status.cytora.com,2005:Incident/17257312018-02-22T12:00:00+00:002018-02-22T12:00:00+00:00High error in our API autocomplete services<p><small>Feb <var data-var='date'>22</var>, <var data-var='time'>12:00</var> GMT</small><br><strong>Resolved</strong> - We recorded an elevated number of HTTP 500-level response codes as a result of an increased number of active connections to our querying cluster. Clients may have also experienced higher latencies when accessing the API. Cluster resources have been upscaled, and the issue has been resolved.</p>