In one of the audio clips (in Basics 2), the sentence is "Vocês têm água e sumo?" To me it sounds as if the Portuguese speakers are saying "Vocês têm **e** água e sumo?" Is it a general rule that an extra e-sound is pronounced between a nasal and a stressed "a"? Or am I hearing things? Or is there some other reason for this?