(RTOS_SDK) Memory Leak on mbedtls/lwip

gustavomassa
Posts: 14
Joined: Tue Feb 07, 2017 1:49 pm

Postby gustavomassa » Wed May 24, 2017 7:48 am

Use case: I'm connected to AWS IoT using mbedtls. To check internet connectivity, I open a socket to google.com:80 and send one byte every 15 seconds. Every 4 minutes I also have to close and reopen the socket, because Google closes the connection after 4 minutes.
How to test: In my modem's interface I disconnect the ADSL client for a few seconds and then reconnect it. A few seconds later the Google socket fails to send or reopen, which raises an internet-connectivity-failed event. The network controller then fully destroys the mbedtls context, waits for the internet (the Google socket) to come back, and reconnects mbedtls. Between each internet disconnection/reconnection I see a memory leak of about 400 bytes. NOTE: When I disconnect the Wi-Fi interface and reconnect it, most of the leaked memory is recovered. The leak is related to mbedtls or lwip.

I'm using the mbedtls-rtos-example code:
https://github.com/espressif/esp8266-rt ... edtls_demo

This is the code used for the internet connection test:

Code:

#define SOCKET_HOST  ( ( const char * ) "google.com" )
#define SOCKET_PORT ( ( uint16 ) 80 ) //TCP ports are 16 bits wide

static int socket_fd = -1;
static bool internetStatus = false;

LOCAL bool host2addr(const char *hostname , struct in_addr *in) {
    struct addrinfo hints, *servinfo, *p;
    struct sockaddr_in *h;
    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_INET;
    hints.ai_socktype = SOCK_STREAM;
    if (lwip_getaddrinfo(hostname, 0 , &hints , &servinfo) != 0)
    {
        return false;
    }
    //use the first resolved address and stop
    bool found = false;
    for (p = servinfo; p != 0; p = p->ai_next)
    {
        h = (struct sockaddr_in *)p->ai_addr;
        in->s_addr = h->sin_addr.s_addr;
        found = true;
        break;
    }
    lwip_freeaddrinfo(servinfo);
    return found;
}


LOCAL bool openGoogleSocket() {
    //create lwip TCP socket
    socket_fd = lwip_socket(AF_INET, SOCK_STREAM, 0);
    //lwip_socket returns -1 on failure; any value >= 0 is a valid descriptor
    if( socket_fd < 0 )
    {
        IOT_ERROR("lwip_socket failed");
        internetStatus = false;
        return false;
    }
    //set socket force close
    lwip_force_close_set(1);
    //set TCP socket keepAlive
    uint32 opt = 1;
    if(lwip_setsockopt(socket_fd, SOL_SOCKET, SO_KEEPALIVE, &opt, sizeof(opt)) != 0) {
        IOT_ERROR("lwip_setsockopt SO_KEEPALIVE failed");
        lwip_close(socket_fd);
        internetStatus = false;
        return false;
    }
    //set socket receive timeout
    opt = 10000;
    if(lwip_setsockopt(socket_fd, SOL_SOCKET, SO_RCVTIMEO, &opt, sizeof(opt)) != 0) {
        IOT_ERROR("lwip_setsockopt SO_RCVTIMEO failed");
        lwip_close(socket_fd);
        internetStatus = false;
        return false;
    }

    //try to connect to google.com:80
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr)); //zero the fields that are not set explicitly below
    if (!host2addr(SOCKET_HOST, &(addr.sin_addr)))
    {
        IOT_ERROR("Invalid SOCKET_HOST: %s", SOCKET_HOST);
        lwip_close(socket_fd);
        internetStatus = false;
        return false;
    }
    addr.sin_family = AF_INET;
    addr.sin_port = htons(SOCKET_PORT);
    if( lwip_connect(socket_fd, (struct sockaddr*)&addr, sizeof(addr)) != 0 )
    {
        IOT_ERROR("lwip_connect failed");
        lwip_close(socket_fd);
        internetStatus = false;
        return false;
    }
    internetStatus = true;
    return true;
}

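LOCAL void closeGoogleSocket() {
    if (socket_fd < 0) {
        return; //nothing to close
    }
    lwip_force_close_set(1);
    lwip_close(socket_fd);
    socket_fd = -1; //mark as closed so a stale descriptor is never reused
}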


bool isInternetConnected() {
   if(!physicalStatus) {
      internetStatus = false;
      return false;
   }

   int error = 0;
   socklen_t len = sizeof (error);
   int retval = lwip_getsockopt (socket_fd, SOL_SOCKET, SO_ERROR, &error, &len);
   if (retval != 0) {
       /* there was a problem getting the error code */
      IOT_ERROR("lwip_getsockopt SO_ERROR failed");
      internetStatus = false;
      return false;
   }
   if (error != 0) {
       /* socket has a non zero error status */
      IOT_ERROR("%d", error);
      internetStatus = false;
      return false;
   }
   /* write p */
   if(lwip_write(socket_fd, "p", 1) != 1) {
      internetStatus = false;
      return false;
   }
   internetStatus = true;
   return true;
}


NOTE: In the RTOS_SDK, the mbedtls "net" layer uses the original net.c file from ESP8266_RTOS_SDK\third_party\mbedtls\library, while the NON_OS SDK uses the "espconn_mbedtls.c" wrapper, which uses lwip underneath. Shouldn't the RTOS_SDK mbedtls use the "espconn" wrapper instead of the original net file?

Thank you
Regards

donghengqaz
Posts: 5
Joined: Tue Jun 13, 2017 11:40 am

Postby donghengqaz » Tue Jun 13, 2017 11:47 am

Hi,
Some memory may appear to leak at first, but the total amount leaked is constant (that is, it will not keep growing forever). This is due to the LWIP internal TCP management policy. So if the leak does not keep growing, it is normal and OK.

ememberus
Posts: 21
Joined: Thu May 04, 2017 12:53 am

Postby ememberus » Tue Jun 13, 2017 8:12 pm

Doesn't "between each internet disconnection/re-connection" already imply that it goes on and on forever?
Is there a reason to interpret it differently?

gustavomassa
Posts: 14
Joined: Tue Feb 07, 2017 1:49 pm

Postby gustavomassa » Wed Jun 14, 2017 8:00 am

donghengqaz wrote:
Hi,
Some memory may appear to leak at first, but the total amount leaked is constant (that is, it will not keep growing forever). This is due to the LWIP internal TCP management policy. So if the leak does not keep growing, it is normal and OK.


Hi donghengqaz,

In this case the leak continues indefinitely, until the Wi-Fi (physical layer) disconnects and reconnects or the device is restarted.
This is the scenario: mbedtls uses about 30 KB of RAM; during the handshake, mbedtls uses about 17 KB of RAM. After an internet-connectivity-failed event, I see a memory leak of 400 bytes at every handshake. So if we start the application with, say, 20 KB of memory available for the mbedtls handshake, then after about 7 network reconnections (internet connectivity plus the mbedtls connection to AWS) the application will run out of memory during the handshake step, resulting in a critical system failure.
The application needs to be able to reconnect to the network as many times as necessary without any issues, to guarantee stability and reliability.
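The arithmetic behind the "about 7 reconnections" estimate can be sketched in C. This is only an illustration using the figures quoted in this post (20 KB free, about 17 KB needed by the handshake, about 400 bytes lost per reconnection); the function name reconnects_until_oom is hypothetical and not part of any SDK.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: how many reconnect cycles can complete before the
 * free heap drops below what the handshake needs. */
static uint32_t reconnects_until_oom(uint32_t free_heap,
                                     uint32_t handshake_need,
                                     uint32_t leak_per_cycle)
{
    assert(leak_per_cycle > 0);
    if (free_heap < handshake_need) {
        return 0; /* already too little memory for a single handshake */
    }
    /* Integer division: only whole cycles count. */
    return (free_heap - handshake_need) / leak_per_cycle;
}
```

With reconnects_until_oom(20 * 1024, 17 * 1024, 400) the margin is 3072 bytes, which covers 7 whole leak cycles, in line with the estimate above.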

donghengqaz
Posts: 5
Joined: Tue Jun 13, 2017 11:40 am

Postby donghengqaz » Thu Jun 15, 2017 4:57 pm

Could you please share your code with me? With the code, I can investigate the problem better.

gustavomassa
Posts: 14
Joined: Tue Feb 07, 2017 1:49 pm

Postby gustavomassa » Tue Jun 20, 2017 2:47 pm

hey donghengqaz,

Sorry about the delay, I was testing other options.

Here we go, first of all, I've changed from lwip TCP socket to lwip ICMP.

I've been using this example to implement the ping: https://github.com/goertzenator/lwip/bl ... ing/ping.c

First I tried to use the socket example, but for some reason lwip fails when creating a socket of type SOCK_RAW.

Code:

lwip_socket(AF_INET, SOCK_RAW, IP_PROTO_ICMP)

Then I went through the RAW example and was able to make it work, but the memory leak is even worse: about 280 bytes on every ping interval (15 seconds in my case).

I'm sending ping to google: 8.8.8.8

Code:

static ip_addr_t ping_target;
IP4_ADDR(&ping_target,8,8,8,8);


So you can use that example to debug and handle the problem; you just need to change "ping_thread" to a FreeRTOS task format and then start it using the "xTaskCreate" API.

Here is a part of my UART output showing the memory leak:

Code:

861,385 @"511 - SENDING PING TO GOOGLE - H"
861,385 @"eap: 15752\n"
861,385 @"(ping_send) 158558927 - ALLOCATI"
861,385 @"NG PBUF SIZE: 40 - HEAP: 15752\n"
861,463 @"(ping_recv) 158657362 - Ping: re"
861,470 @"cv 90 id: 4368 seq_num: 4882 - H"
861,472 @"eap: 15528\n"
876,863 @"(network_controller_task) 174055"
876,884 @"435 - SENDING PING TO GOOGLE - H"
876,887 @"eap: 15592\n"
876,887 @"(ping_send) 174058845 - ALLOCATI"
876,888 @"NG PBUF SIZE: 40 - HEAP: 15592\n"
877,088 @"(ping_recv) 174281913 - Ping: re"
877,094 @"cv 210 id: 4368 seq_num: 4882 - "
877,097 @"Heap: 15368\n"
892,374 @"(network_controller_task) 189555"
892,374 @"424 - SENDING PING TO GOOGLE - H"
892,375 @"eap: 15432\n"
892,376 @"(ping_send) 189558839 - ALLOCATI"
892,381 @"NG PBUF SIZE: 40 - HEAP: 15432\n"
892,518 @"(ping_recv) 189712773 - Ping: re"
892,525 @"cv 140 id: 4368 seq_num: 4882 - "
892,527 @"Heap: 15208\n"
907,882 @"(network_controller_task) 205055"
907,882 @"421 - SENDING PING TO GOOGLE - H"
907,882 @"eap: 15240\n"
907,882 @"(ping_send) 205060480 - ALLOCATI"
907,882 @"NG PBUF SIZE: 40 - HEAP: 15240\n"
907,963 @"(ping_recv) 205157158 - Ping: re"
907,969 @"cv 90 id: 4368 seq_num: 4882 - H"
907,971 @"eap: 15016\n"
923,387 @"(network_controller_task) 220555"
923,387 @"424 - SENDING PING TO GOOGLE - H"
923,387 @"eap: 15080\n"
923,387 @"(ping_send) 220558839 - ALLOCATI"
923,387 @"NG PBUF SIZE: 40 - HEAP: 15080\n"
923,471 @"(ping_recv) 220656648 - Ping: re"
923,472 @"cv 90 id: 4368 seq_num: 4882 - H"
923,472 @"eap: 14856\n"


I've tried every approach I could think of to avoid the memory leak: using a global pcb with a global pbuf, and allocating a pcb and pbuf on each ping request and then destroying them. I've also added a lot of vTaskEnterCritical() and ExitCritical calls around the lwip calls, but it didn't change anything. Every approach ends in the same memory leak. It seems that the leak in the RAW ICMP implementation is the same leak seen with the lwip TCP socket. So I really think the problem is internal, probably in the integration between lwip and the esp8266 network stack/tx/rx task.
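As a cross-check on the UART output above, the net loss per ping interval can be computed from the heap values logged at each "SENDING PING TO GOOGLE" line. This is a minimal sketch; avg_leak_per_cycle is a hypothetical helper name, and the samples are copied from the log excerpt.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Free-heap values logged just before each ping in the excerpt. */
static const uint32_t heap_samples[] = { 15752, 15592, 15432, 15240, 15080 };

/* Average net heap loss per interval between the first and last sample. */
static uint32_t avg_leak_per_cycle(const uint32_t *heap, size_t n)
{
    assert(n >= 2);
    if (heap[n - 1] >= heap[0]) {
        return 0; /* heap did not shrink over the window */
    }
    /* Total drop spread over the n - 1 intervals between samples. */
    return (heap[0] - heap[n - 1]) / (uint32_t)(n - 1);
}
```

For the five samples in this excerpt the result is 168 bytes per 15-second interval (the individual steps are 160, 160, 192 and 160 bytes). Within a single cycle the heap drops by 224 bytes between ping_send and ping_recv, and part of that is returned before the next ping.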


Another thing I noticed in the RAW implementation concerns "struct icmp_echo_hdr": calling the function "ping_prepare_echo" and setting the ping ID and seq_num has no effect; the ping echo response always has the same ID and seq_num, "4368 and 4882". When the internet is disconnected, the ping_recv callback still receives a ping echo, with a different ping ID and a 0 ms response time.
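One possible explanation for the constant values, offered as a hypothesis rather than a confirmed diagnosis: 4368 is 0x1110 and 4882 is 0x1312, which is what a little-endian CPU reads from the byte sequence 0x10 0x11 0x12 0x13. If the echo payload is filled with incrementing bytes (0, 1, 2, ...), as the lwip contrib ping example appears to do, those bytes sit exactly 20 bytes past the real id/seqno fields, and 20 bytes is the length of an IPv4 header without options. That would be consistent with the header pointer being advanced by one IP header too many when the reply is parsed. The sketch below reproduces the numbers on a plain buffer; the 32-byte payload (8 + 32 = 40, the pbuf size in the log), the id value and the helper names are assumptions for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define ICMP_HDR_LEN  8   /* type, code, checksum, id, seqno */
#define PING_DATA_LEN 32  /* assumed payload size: 8 + 32 = 40 bytes */
#define IP_HDR_LEN    20  /* IPv4 header without options */

/* Read a 16-bit field the way a little-endian CPU loads it from memory. */
static uint16_t load_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Build an echo request; payload byte i holds the value i (assumed pattern).
 * The checksum is left at zero because it is not needed for this demo. */
static void build_echo(uint8_t *pkt, uint16_t id, uint16_t seq)
{
    memset(pkt, 0, ICMP_HDR_LEN + PING_DATA_LEN);
    pkt[0] = 8;                    /* ICMP type 8: echo request */
    pkt[4] = (uint8_t)(id >> 8);   /* id, network byte order */
    pkt[5] = (uint8_t)id;
    pkt[6] = (uint8_t)(seq >> 8);  /* seqno, network byte order */
    pkt[7] = (uint8_t)seq;
    for (int i = 0; i < PING_DATA_LEN; i++) {
        pkt[ICMP_HDR_LEN + i] = (uint8_t)i;
    }
}
```

Reading the id and seqno fields at pkt + IP_HDR_LEN instead of pkt returns 4368 and 4882 regardless of the id and seq passed to build_echo, which matches the symptom described above. If this is what is happening, checking whether the payload pointer in ping_recv already points at the ICMP header before the IP header is skipped would be a reasonable first step.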

moonyuan
Posts: 1
Joined: Wed Jun 21, 2017 11:11 am

Postby moonyuan » Mon Jun 26, 2017 11:04 am

Does anyone know how to fix it? I'm also running into this issue. :?

Her Mary
Posts: 537
Joined: Mon Oct 27, 2014 11:09 am

Postby Her Mary » Mon Jun 26, 2017 3:54 pm

Can you provide a test project? I failed to compile the source code mentioned above.

ESP_Faye
Posts: 1646
Joined: Mon Oct 27, 2014 11:08 am

Postby ESP_Faye » Mon Aug 07, 2017 2:08 pm

Hi,

We have recently fixed a memory leak issue; please try the latest ESP8266_RTOS_SDK.

Thanks for your interest in ESP8266!

davydnorris
Posts: 9
Joined: Sat May 20, 2017 9:46 am

Postby davydnorris » Sat Aug 12, 2017 12:03 pm

@ESP_Faye

Will these fixes also be merged into the NONOS versions of the libs?

Would it make sense to have the libs in a separate repo that can be used as a subcomponent of both RTOS and NONOS SDK repos? Then a fix in one place will cover both?
